Sometimes R gets a bad deal.
- R has come a long way: it is a powerful functional programming platform with print-ready visualization tools. You can do almost anything in it; the limit is your hard work and imagination.
- R is a free implementation of the S language; the original seed, S, became the commercial product S-PLUS.
- You can do anything in R, much as you can do anything in C or C++.
- R is continually supported by a large community of practitioners, and you can get help for free.
- One version of R became Revolution R, a separate branch sold as an enterprise edition with some special-purpose packages; Revolution Analytics, the company behind it, was later bought by Microsoft.
- One of the rock stars of R is Hadley Wickham, who is continually making great contributions to improving R.
- For the daily life of a data science practitioner, there is no pride in saying you do not know R, even if you know Python.
- Here is a copy of Advanced R by Hadley Wickham: https://englianhu.files.wordpress.com/2016/05/advanced-r.pdf (I am not sure whether it is truly free or someone leaked it; I am buying the real book).
- R BEATS PYTHON! R BEATS JULIA! ANYONE ELSE WANNA CHALLENGE R?
- It is neither humble nor a point of pride for a data scientist to say that he or she does not know R. It is an important tool in a data scientist's toolbox.
- Many complain that R is difficult; even more have complained about SAS and continue to do so, yet SAS's power is well known.
- War of languages
Another war of languages, this one about R/Python/Octave/MATLAB/Julia. One more: https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/
- From my perspective, the minimalist set of programming languages for a data scientist is R/Python/SAS. Until another language proves itself, do not waste your time on it.
This summary provides a quick review of text analytics and some key references.
To learn richer and deeper levels of analysis, and the resources for them, we will go beyond the scope of the bag-of-words approach.
The idea of sentiment analysis is to identify the polarity of moods. We model it as a supervised learning problem: each mood is defined by multiple words, and the frequencies of words or phrases already known to represent that polarity in a given culture are used to train the model.
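As a rough illustration of the supervised, word-frequency idea described above, here is a minimal sketch in Python. The four training sentences, the `pos`/`neg` labels, and the function names are all made up for illustration; the scoring rule (a sum of add-one-smoothed log-odds per word) is one simple choice among many, not a method prescribed by any of the references.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_polarity_model(labeled_docs):
    """Count how often each word occurs under each polarity label."""
    counts = {"pos": Counter(), "neg": Counter()}
    for text, label in labeled_docs:
        counts[label].update(tokenize(text))
    return counts


def polarity_score(model, text):
    """Sum smoothed log-odds of each known word; > 0 leans positive."""
    vocab = set(model["pos"]) | set(model["neg"])
    # Add-one smoothing so a word unseen under one label does not give log(0).
    pos_total = sum(model["pos"].values()) + len(vocab)
    neg_total = sum(model["neg"].values()) + len(vocab)
    score = 0.0
    for word in tokenize(text):
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        p_pos = (model["pos"][word] + 1) / pos_total
        p_neg = (model["neg"][word] + 1) / neg_total
        score += math.log(p_pos / p_neg)
    return score


# Hypothetical labeled training data (illustration only).
TRAINING = [
    ("the food was great and the staff were friendly", "pos"),
    ("great service and a lovely evening", "pos"),
    ("terrible food and rude staff", "neg"),
    ("the service was slow and the room was dirty", "neg"),
]

model = train_polarity_model(TRAINING)
print(polarity_score(model, "great friendly evening"))  # positive number
print(polarity_score(model, "rude and dirty"))          # negative number
```

With so little data the scores only show the mechanics; a realistic model needs a large labeled corpus and, as noted, more than a bag of words.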
The common methods, each with an example application area:
- Polarity analysis: Analysis of reviews is an important application here. Even though this involves latent class ideas, the well-defined, polarized independent variables (words) help us avoid the challenges of deep latent variables.
- Subjectivity/objectivity identification: Classifying a corpus as either subjective or objective content. This has inherent challenges because the collection of words that defines the subjective/objective distinction is itself influenced by the classification we intend to make; the boundary between subjective and objective is often not well defined.
- Features/aspects analysis: This analyzes what matters most within sub-areas of polarity identification. For example, in a restaurant review, the food may be great while the service and hygiene could be better. Here we use neighboring phrases as important contributors to the sub-polarities.
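The neighboring-phrase idea behind features/aspects analysis can be sketched as follows. This is a toy under stated assumptions: the tiny `POSITIVE` and `NEGATIVE` word sets, the `window` size, and the sample review are invented for illustration, and a real system would use a proper sentiment lexicon or a trained model.

```python
import re

# Hypothetical miniature sentiment lexicon (illustration only).
POSITIVE = {"great", "good", "excellent", "friendly"}
NEGATIVE = {"poor", "slow", "dirty", "bad"}


def aspect_polarity(text, aspects, window=2):
    """Score each aspect word by the sentiment words within `window` tokens of it."""
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = {aspect: 0 for aspect in aspects}
    for i, token in enumerate(tokens):
        if token not in scores:
            continue
        # Look only at the aspect's immediate neighbors on both sides.
        neighbors = tokens[max(0, i - window): i + window + 1]
        for word in neighbors:
            if word in POSITIVE:
                scores[token] += 1
            elif word in NEGATIVE:
                scores[token] -= 1
    return scores


review = "The food is great but the service was slow and the hygiene was poor"
print(aspect_polarity(review, ["food", "service", "hygiene"]))
# {'food': 1, 'service': -1, 'hygiene': -1}
```

A fixed window like this misses hedged wording such as "could be better", which is exactly where grammar and cultural usage start to matter.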
The methods combine a knowledge base of the meaning of sentiments/likeability and objective/subjective interpretations, statistical and machine learning approaches, and the grammar and culture of a language. The last is the hardest at which to reach human-level proficiency, because the way words are used can involve many levels of latent meaning and their interactions.
Some key references:
RTextTools: A Supervised Learning Package for Text Classification
Introduction to text analysis with an application to cluster analysis
Twitter analysis for stock market sentiment
Measuring the Happiness of Large-Scale Written Expression: Songs, Blogs, and Presidents
Predicting Elections with Twitter: What 140 Characters Reveal about Political Sentiment
https://www.google.com/patents/US20090282019 - A patent on how to implement a text analytics server
The author of the XGBoost package presents why and how to use XGBoost
A business model is the central nervous system of a company: its function is to coordinate all entrepreneurial support systems for the sole purpose of achieving growth strategies in which the right customer is willing to pay the right price for the right products/services.
A business model is a creative, intelligent way of solving your business problems of growth, sustainability, and convincing the consumer to buy; having the greatest products/services alone is not enough. It is not just a marketing strategy. It is also a pricing strategy, a product priority strategy, a segment strategy, and a strategy for the timing and priorities of markets and consumers.
It is often difficult to develop a 10-100 million dollar product. A product-first start-up strategy tends to lead to losses more often and more quickly. It is much easier to develop a 5-20 million dollar service company than a 10-100 million dollar product company. The bigger the vision, the larger the absolute loss when a product fails.
As to why most products fail, read https://hbr.org/2011/04/why-most-product-launches-fail
Nonetheless, whether it sells a product or a service, there are many ways an organization can compete intelligently in the marketplace using a business model.
For a lucid comparative discussion of 24 examples with 24 different business models/pricing strategies, see http://www.dummies.com/education/college/examples-of-business-models/
- A John Wiley resource
The Yin-Yang of Business Models: Discovery Process and Execution for excellence
The importance of the business model, and the need to keep innovating on it in the marketplace, does not end even for a well-established company, although it is most often identified as a critical success factor, and a critical lift factor for strategic KPIs, for start-ups. Among start-ups it is easier to see multi-fold lifts from innovative, disruptive business models.
The yin-yang of business model discovery and business model execution, and how to move through the discovery process quickly, is explained by Stefan Gross-Selbeck in "Business model innovation".
Traditionally, a start-up is different from a well-established company.
A start-up is trying to get all its support functions working smoothly and contributing to making money. At the same time it is figuring out how to do this efficiently, so that it can sustain itself and grow even if margins fall, and how to keep its differentiation and uniqueness as a competitive force, so that newer competitors do not take away its ability to sustain and grow. It does all of this with a focus on making money while growing and sustaining itself through its cycles of daily operations.
A start-up is always trying to figure out its business model; it stays in discovery mode until it becomes a well-established company, at which point the threat to its existence is no longer a serious problem, though sustained growth may be.
A well-established company, on the other hand, has been filtered through those situations over time. It is mostly trying to optimize resources, inputs, and budget to maintain sustainability and growth. It has already figured out its model and optimized how the various components come together; it is executing the initial model that sustained it and grew into a more successful continuation model, and, where possible, it keeps growing modestly for its stakeholders.
In a one-word comparison, these are two different worlds of focus on what makes a company succeed, sustain itself, and become an established success: discovery vs. execution.
A big problem is that not every company can work on both sides, yet both are needed. Whether start-up or well established, a company that wants to keep thriving and growing in the marketplace has to have the mindset of a start-up and still execute on a daily basis. If a well-established company does not face its competitors' disruptive business models head-on, it will become irrelevant and lose its market to them. There is a systematic way to have a start-up mindset and execute on a daily basis, and the key characteristics of such a mindset are:
- Attacker mindset to win
- Agile mindset to win
- Failure-is-OK mindset to win
So, in essence, “Discovery” and “Execution” are the two parts of the yin-yang of the business model.
A comparison of product vs. service start-ups.
I want to bring out why to choose one over the other before kindling your thoughts on how to execute one over the other, with a plan to move from services to product.
We know something is a service (and hence not a product) when we see the following attributes
For more extensive details on how to identify something as a service (and hence what is not a service), here is a list from a reference; the reference is at the bottom.
For a detailed executive summary see http://www.mckinsey.com/~/media/McKinsey/Business%20Functions/Business%20Technology/Our%20Insights/Disruptive%20technologies/MGI_Disruptive_technologies_Executive_summary_May2013.ashx
A short meta summary is provided here:
Changes in technology are always happening because of competition. There is always hope and excitement when a new technology develops, and on top of that the business community looks for opportunities to readjust its strategies and reposition its competitiveness in light of the new developments. Certain technologies show early, explosive successes that indicate the potential to change the business structures of the world, improve people's quality of life by something like 10x, bring significant new opportunities in untapped areas, and widen the level playing field for all segments of the world population. These are considered disruptive technologies. In the words of Joseph Schumpeter, they lead organizations to “creative destruction”, resulting in new industries, new products/services, new production methods, new applications, and new lifestyles, and bulldozing incumbent businesses in favor of newer ones. That in turn brings new business processes, new business measures, new data, and new ways of deriving insights from the new types of data. We have seen early discussions of such topics, big data for example, in the last 6 years. The economic impact of these technologies ranges from a most-certain value of $16.7T to a possible maximum of $33T, overshadowing the impact of all the previous developments the history of the world has seen.
In this document McKinsey identifies twelve such disruptive technologies. The central one, which in fact builds, supports, and empowers the other 11, is the “Automation of Knowledge Management”. It has the largest most-certain economic value, $5.2T out of a total most-certain value of $16.7T. That is close to one third of the most-certain economic value of all twelve technologies combined. The exciting thing is that this economic value is realizable in the next 10 years.
This course is uniquely positioned to address the entrepreneurial opportunities that are coming together in the space of data, analytics, and technology, as a platform for knowledge management.
Some key highlights of knowledge management are:
- This is all made possible by the enormous computing power achieved in the last 30 years, with a 100-fold increase in the sophistication and variety of applications. The iPhone 4c, compared with the computer used in the Apollo 11 lunar module, is one such example! This makes possible the creation of all types of new data as streams: a continuous flow that produces lots of unstructured data besides structured data, for which we have yet to discover efficient and systematic ways of collecting, analyzing, and applying the insights.
- There will be nearly 230 million knowledge workers, 9% of the world's workforce, engaged by this disruptive industry.
- These 230 million people will account for nearly $9T in employment costs, about one fourth of global employment costs, which indicates the importance of HR/people analytics.
- This will change the nature of work, create opportunities for entrepreneurs, help create new products and services, change organizational structures, shift comparative advantages among nations, affect employment, and pose new regulatory and legal challenges.
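The figures quoted above can be cross-checked with a little arithmetic. The sketch below uses only the numbers stated in this summary; the derived quantities (implied global workforce, implied global employment cost, cost per knowledge worker) are back-of-the-envelope implications of those numbers, not values taken from the McKinsey report, and the variable names are my own.

```python
# Figures as quoted in this summary (T = trillions of US dollars).
knowledge_value_t = 5.2        # most-certain value of knowledge automation
total_value_t = 16.7           # most-certain value of all twelve technologies
knowledge_workers = 230e6      # knowledge workers
workforce_share = 0.09         # their share of the world workforce
employment_cost_t = 9.0        # their employment cost
cost_share = 0.25              # "one fourth" of global employment costs

# Derived, implied quantities (assumptions of this sketch, not report values).
value_share = knowledge_value_t / total_value_t
implied_workforce = knowledge_workers / workforce_share
implied_global_cost_t = employment_cost_t / cost_share
cost_per_worker = employment_cost_t * 1e12 / knowledge_workers

print(f"Share of most-certain value: {value_share:.1%}")           # 31.1%
print(f"Implied world workforce: {implied_workforce / 1e9:.2f}B")  # 2.56B
print(f"Implied global employment cost: ${implied_global_cost_t:.0f}T")  # $36T
print(f"Implied cost per knowledge worker: ${cost_per_worker:,.0f}")     # $39,130
```

The 31% share is what "close to one third" refers to, and the roughly $39,000 average cost per knowledge worker gives a sense of the scale behind the people-analytics point.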
Let us get clarity, explore possibilities, and venture in unexplored areas of opportunities.