Why are LinkedIn and Twitter up for grabs in 2012-14?

Given Facebook’s valuation of $60-$100 billion, Apple’s $100 billion cash pile, Microsoft’s $52 billion in cash, and Google’s $43 billion in cash, there is a lot of money floating around. I am not counting Amazon, as it deals with its own Fire issues.

But what is left to buy? In terms of the richness of data available for mining for better advertising, it is Twitter and LinkedIn that have the best sources of data.

And LinkedIn is worth only $9 billion and Twitter only $8.5 billion. Throw in a competitive-dynamic premium, and you can get 50% of both these companies for $13 billion. If the owners don't want to sell 100%, well, buy a big, big stake.
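As a quick sanity check on that arithmetic, here is a minimal sketch using only the figures quoted above (valuations in billions of dollars). The variable names are mine, and the "implied premium" is simply what the $13 billion figure works out to against the quoted valuations, not a number from any deal.

```python
# Figures quoted in the post, in billions of US dollars.
linkedin_value = 9.0
twitter_value = 8.5
stake = 0.50   # fraction of each company acquired
offer = 13.0   # price suggested in the post for that stake

combined_value = linkedin_value + twitter_value   # value of both companies together
stake_at_market = stake * combined_value          # cost of the stake with no premium
implied_premium = offer / stake_at_market - 1.0   # fraction paid above quoted value

print(f"Combined value: ${combined_value:.2f}B")
print(f"50% stake at quoted valuations: ${stake_at_market:.2f}B")
print(f"Implied premium at ${offer:.0f}B: {implied_premium:.1%}")
```

In other words, $13 billion for half of both companies leaves room for a premium of a little under 50% over the quoted valuations.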

Makes a good case: buy the company, buy the data, sell them ads, sell them better products.

What do you think?

Third Party Extensions to WPS

WPS, which uses the language of SAS and is much more moderately priced, recently announced Version 3.

As per MineQuest (http://www.minequest.com/):

Prices start from $1206.

WPS  is a SAS language alternative application that can run most of your existing Base
programs without modification. WPS also provides support for a limited subset of
graphics via WPS Graphing and statistical procedures via WPS Statistics.

In addition to the broad language support, WPS is affordably priced and much less
restrictive in its licensing. WPS supports many elements of the language including
the macro language facility, the data step, most of the PROCS found in base, and
includes access engines to ODBC, DB/2, Oracle, MySQL, OLE DB, Informix,
GreenPlum, Sybase, Teradata, SPSS and other database systems.

WPS has a feature that allows you to develop extensions for it.

http://teamwpc.co.uk/products/wps/modules/language_sdk

Develop Custom Language Items

Anyone who is familiar with traditional programming languages such as Assembler, C or C++ can use the WPS Language SDK developer module to add language items to extend the language of SAS support in WPS.

Once you have created your own custom language items, you can freely redistribute them to anybody who uses WPS on the same platform.

Some existing third-party extensions are:

http://www.minequest.com/Bridge2R.html

Bridge to R for WPS
The Bridge to R is an application created and developed by MineQuest that allows the WPS
programmer to access the R system. With the Bridge to R, WPS Users can write and execute R code
from within the WPS Workbench or in batch mode using WPS. A simple yet powerful integration facility
allows the WPS developer to write R statements to access advanced features found in R, test
algorithms and create new statistical methodologies.

One of the greatest advantages of the Bridge to R is that you and your developers can use the SAS
language for manipulating and reporting your data and use R for advanced graphics and statistical
computing. The Bridge to R allows WPS Users to write R code and execute R code from within the
WPS Workbench and receive the results back in the appropriate log and listing window. The Bridge to
R even supports running your WPS and R programs in batch for production purposes.

It allows you to use either 32-bit or 64-bit R depending on whether your OS platform supports 64-bit computing. On 64-bit  platforms with sufficient RAM, developers can process large amounts of data using R.

The Bridge to R requires WPS 2.5 or higher. The Bridge is provided as part of the Analyst Power Package
and is included when you license WPS from MineQuest.

http://www.minequest.com/WindowsPowerPack.html

Parallel Process with MPExec
MPExec allows the WPS developer to dramatically reduce processing time by easily implementing parallel processing on their Windows workstation or server. MPExec lets WPS developers take existing WPS/SAS code and, by inserting a few statements, allow their programs to access all the cores and resources on their Windows hardware platform. MPExec allows you to parallel process up to 255 tasks and manages the collection of log and listing files for you.

Other third-party applications are:

http://www.uniqcus.com/english/products.html

UniQZConnect

UniQZConnect will allow your SAS and WPS users to have a SAS or WPS session on a Windows workstation connect directly to SAS or WPS on z/OS. This gives the user the ability not only to download and upload data from and to z/OS, but also to submit SAS jobs for remote execution.

The product supports a wizard user dialog, a full script interface, and a mixture of these.

The performance of the product is limited only by your current mainframe bottlenecks and the current transfer limits of your FTP connection. Security is handled through FTP, ensuring that you are in compliance with your current security policies.

You can download a product sheet for UniQZConnect here.

And an upcoming application, CMAT, from Wolfgang at http://www.wcmat.com/cmat/

Things ToDo
CMAT was first written for Unix and later for Windows. We are now working on a 32 bit version for Linux and Unix, and after that on a 64 bit version for Windows, Linux, and Unix. We are also working on an interface to WPS and R.

-----

Overall, creating applications or extensions can help WPS reach a wider audience. RapidMiner also has a marketplace for extensions, and JMP has extensions too, but one critical feature in statistical computing development would be a place where coders can finally earn some money by creating algorithms, packages and extensions (at least enough to compare with game creators on mobile!).

 

 

 

Statistical Theory for High Performance Analytics

One thing that struck me when I was a student of statistics was that most theories of sampling, hypothesis testing and modeling were built in an age when data was predominantly scarce, computation was inherently manual, and tests were aimed at detecting large enough differences.

Looking now at the explosion of data, at cloud-enabled processing power on demand, and at the competitive dynamics of businesses, I venture my opinion:

1) We now have large, even excess, amounts of data compared with what statisticians had a generation ago.

2) We now have extremely powerful computing devices, provided we can process our algorithms in parallel.

3) Even a slight uptick in modeling efficiency or mild uptick in business insight can provide huge monetary savings.
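To make the first and third points concrete, here is a minimal sketch of how a classical test behaves as the sample grows. It uses a textbook two-sample z-test with a known standard deviation; the effect size, standard deviation and sample sizes are hypothetical numbers chosen only for illustration.

```python
import math

def two_sided_p_value(diff, sd, n_per_group):
    """Two-sided p-value of a two-sample z-test (equal group sizes, known sd)."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = diff / standard_error
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical effect: a difference of 1% of one standard deviation.
diff, sd = 0.01, 1.0

for n in (100, 10_000, 1_000_000):
    print(f"n per group = {n:>9,}: p = {two_sided_p_value(diff, sd, n):.3g}")
```

The same tiny difference is invisible with a hundred observations per group and overwhelmingly "significant" with a million. A framework built to detect large differences in scarce data answers a different question once data is abundant, and then what matters is whether the small difference is worth money.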

Call it High Performance Analytics or Big Data or Cloud Computing: are we sure statisticians are creating enough mathematical theory, or are we just taking it easy in our statistics classrooms, only to be subjected to something completely different when we hit the analytics workplace?

Do we need more theorists as well? Is there ANY incentive for corporations with private R&D teams to share their latest cutting-edge theoretical work outside their corporate silo?

 


Oracle launches its version of R #rstats

From-

http://www.oracle.com/us/corporate/press/1515738

Integrates R Statistical Programming Language into Oracle Database 11g

News Facts

Oracle today announced the availability of Oracle Advanced Analytics, a new option for Oracle Database 11g that bundles Oracle R Enterprise together with Oracle Data Mining.
Oracle R Enterprise delivers enterprise class performance for users of the R statistical programming language, increasing the scale of data that can be analyzed by orders of magnitude using Oracle Database 11g.
R has attracted over two million users since its introduction in 1995, and Oracle R Enterprise dramatically advances capability for R users. Their existing R development skills, tools, and scripts can now also run transparently, and scale against data stored in Oracle Database 11g.
Customer testing of Oracle R Enterprise for Big Data analytics on Oracle Exadata has shown up to 100x increase in performance in comparison to their current environment.
Oracle Data Mining, now part of Oracle Advanced Analytics, helps enable customers to easily build and deploy predictive analytic applications that help deliver new insights into business performance.
Oracle Advanced Analytics, in conjunction with Oracle Big Data Appliance, Oracle Exadata Database Machine and Oracle Exalytics In-Memory Machine, delivers the industry’s most integrated and comprehensive platform for Big Data analytics.

Comprehensive In-Database Platform for Advanced Analytics

Oracle Advanced Analytics brings analytic algorithms to data stored in Oracle Database 11g and Oracle Exadata as opposed to the traditional approach of extracting data to laptops or specialized servers.
With Oracle Advanced Analytics, customers have a comprehensive platform for real-time analytic applications that deliver insight into key business subjects such as churn prediction, product recommendations, and fraud alerting.
By providing direct and controlled access to data stored in Oracle Database 11g, customers can accelerate data analyst productivity while maintaining data security throughout the enterprise.
Powered by decades of Oracle Database innovation, Oracle R Enterprise helps enable analysts to run a variety of sophisticated numerical techniques on billion row data sets in a matter of seconds making iterative, speed of thought, and high-quality numerical analysis on Big Data practical.
Oracle R Enterprise drastically reduces the time to deploy models by eliminating the need to translate the models to other languages before they can be deployed in production.
Oracle R Enterprise integrates the extensive set of Oracle Database data mining algorithms, analytics, and access to Oracle OLAP cubes into the R language for transparent use by R users.
Oracle Data Mining provides an extensive set of in-database data mining algorithms that solve a wide range of business problems. These predictive models can be deployed in Oracle Database 11g and use Oracle Exadata Smart Scan to rapidly score huge volumes of data.
The tight integration between R, Oracle Database 11g, and Hadoop enables R users to write one R script that can run in three different environments: a laptop running open source R, Hadoop running with Oracle Big Data Connectors, and Oracle Database 11g.
Oracle provides single vendor support for the entire Big Data platform spanning the hardware stack, operating system, open source R, Oracle R Enterprise and Oracle Database 11g.
To enable easy enterprise-wide Big Data analysis, results from Oracle Advanced Analytics can be viewed from Oracle Business Intelligence Foundation Suite and Oracle Exalytics In-Memory Machine.

Supporting Quotes

“Oracle is committed to meeting the challenges of Big Data analytics. By building upon the analytical depth of Oracle SQL, Oracle Data Mining and the R environment, Oracle is delivering a scalable and secure Big Data platform to help our customers solve the toughest analytics problems,” said Andrew Mendelsohn, senior vice president, Oracle Server Technologies.
“We work with leading edge customers who rely on us to deliver better BI from their Oracle Databases. The new Oracle R Enterprise functionality allows us to perform deep analytics on Big Data stored in Oracle Databases. By leveraging R and its library of open source contributed CRAN packages combined with the power and scalability of Oracle Database 11g, we can now do that,” said Mark Rittman, co-founder, Rittman Mead.
Oracle Advanced Analytics, an option to Oracle Database 11g Enterprise Edition, extends the database into a comprehensive advanced analytics platform through two major components: Oracle R Enterprise and Oracle Data Mining. With Oracle Advanced Analytics, customers have a comprehensive platform for real-time analytic applications that deliver insight into key business subjects such as churn prediction, product recommendations, and fraud alerting.

Oracle R Enterprise tightly integrates the open source R programming language with the database, extending the database with R's library of statistical functionality and pushing computations down to the database. Oracle R Enterprise dramatically advances the capability for R users: they can use their existing R development skills and tools, and their scripts can now also run transparently and scale against data stored in Oracle Database 11g.

Oracle Data Mining provides powerful data mining algorithms that run as native SQL functions for in-database model building and model deployment. It can be accessed through the SQL Developer extension Oracle Data Miner to build, evaluate, share and deploy predictive analytics methodologies. At the same time the high-performance Oracle-specific data mining algorithms are accessible from R.

BENEFITS

  • Scalability—Allows customers to easily scale analytics as data volume increases by bringing the algorithms to where the data resides – in the database
  • Performance—With analytical operations performed in the database, R users can take advantage of the extreme performance of Oracle Exadata
  • Security—Provides data analysts with direct but controlled access to data in Oracle Database 11g, accelerating data analyst productivity while maintaining data security
  • Save Time and Money—Lowers overall TCO for data analysis by eliminating data movement and shortening the time it takes to transform “raw data” into “actionable information”
Oracle R Hadoop Connector gives R users high-performance native access to the Hadoop Distributed File System (HDFS) and the MapReduce programming framework.
This is an R package.
From the datasheet at

Exciting Contest at CrowdANALYTIX

A new contest from a relatively new website. This one is fast and furious and has a decent chunk of money!

 

From

 

http://www.crowdanalytix.com/contests/airport-guest-sentiment-analysis-1544282253/view/

 


Sun, 26 February 2012 05:00 AM UTC

Mon, 05 March 2012 05:00 AM UTC
 Text Analytics  Aerospace & Aviation

 

Analysis of sentiment and its intensity – feedback from airport guests
ABC (name intentionally obfuscated) is one of the best managed and highly profitable airports  in India. As with all well managed airports, ABC would like to understand what guests feel about their experience when traveling, using or transiting through their airport. ABC has a website in which guests can visit and leave behind a comment, agree or disagree with others’ comments, or respond to a comment confirming or negating the expressed opinion.

The goal of this contest is to create a summarization of the opinions, feelings and sentiments expressed in the comments left behind by guests on the website. This information is being provided as data for solvers. Some understanding of the intensity of the opinion, feeling or sentiment will also be useful. For example, if there is a consistent demand for more spas across guest conversations, it needs to be highlighted.  Consistent positive or negative sentiments and opinions need to be discovered and highlighted.
Data
Guest comments have been crawled and provided to you. The data consists of approximately 1000 comments from guests, including the timestamp of those comments. Personal information (name, email, etc.) has been hidden. This data is publicly available.
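For anyone wondering where to start, here is a minimal sketch of one naive approach: a tiny lexicon-based tally of polarity per comment, with word frequency as a crude stand-in for intensity. Everything in it is an assumption for illustration. The word lists and sample comments are invented, not taken from the contest data, and a serious entry would need a far richer method.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons; a real entry would need larger, validated ones.
POSITIVE = {"clean", "friendly", "fast", "great", "helpful"}
NEGATIVE = {"dirty", "rude", "slow", "crowded", "expensive"}

def tokenize(text):
    """Lowercase a comment and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def summarize(comments):
    """Count comments by overall polarity and tally the lexicon words used."""
    polarity = Counter()
    mentions = Counter()
    for text in comments:
        words = tokenize(text)
        pos = sum(w in POSITIVE for w in words)
        neg = sum(w in NEGATIVE for w in words)
        label = "positive" if pos > neg else "negative" if neg > pos else "neutral"
        polarity[label] += 1
        mentions.update(w for w in words if w in POSITIVE or w in NEGATIVE)
    return polarity, mentions

# Invented comments for illustration only; not from the contest data.
sample = [
    "Security was fast and the staff were friendly.",
    "Food court is expensive and crowded.",
    "Need more spas near the gates.",
]
polarity, mentions = summarize(sample)
print(polarity)
print(mentions.most_common(3))
```

Note that the third comment, a request for more spas, comes out as neutral. Catching that kind of consistent demand takes topic or phrase counting on top of polarity.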
Solver Expectations:
Participants may submit entries before the deadline. If a participant submits multiple entries, the entry submitted last before the deadline will be considered as the participant’s submission.
The following deliverables are expected to be submitted:
  • a report expressing the results of the opinion and sentiment mining
  • documentation about how you approached the problem, what tools, technologies and languages you used, and what problems you encountered
Timeline and Prizes: 
This contest begins on 16 Feb 2012 and will last for a duration of 9 days.
Prizes:
  • One 1st Prize – $1000
  • One 2nd Prize – $500
  • Two 3rd Prizes – $250