Launching the Heroic Data Mining Blog


Since my last post, a lot has happened in the world of analytics inside and outside Oracle. Big data and Big Data Analytics are hot topics now. Also, a number of technologies have flourished in the past few years in this space, to name a few: MapReduce, Hadoop, Mahout, Pig, R, and NoSQL databases. As a result I felt a new blog was needed with a different theme and focus.

The oracledmt blog focused exclusively on Oracle technology. Because of all the new developments in Big Data Analytics, this new blog will go beyond that. It will discuss and illustrate examples with the Oracle stack as well as other technologies. It will cover Big Data Analytics with a special focus on automation. It will also provide information on methodologies and design patterns for solving analytical problems. The aim of this new blog is to shift the focus from experts, who are far too few, toward strategies that can be used by many developers and analysts. The main goal is to help empower developers to create smarter applications that can process, without failure, massive data sets with large numbers of attributes and poor data quality. This is a heroic task indeed.

For those wondering, we have also been very busy in the area of analytics at Oracle since my last post. At the technology level we have introduced technologies such as: Endeca, Exalytics, Exadata, Big Data Appliance, In-DB Hadoop, Oracle R Enterprise, and Spatial and Graph analytics. More directly related to my group we have a number of developments:

Over these years I have also spent a great deal of time working on automating and increasing the ease of use of analytics and big data analytics. This is reflected in different aspects of Oracle Data Miner as well as server-side features like Dynamic Scoring. These ideas and features have also been applied in helping a number of Oracle applications to incorporate advanced analytics, for example:

I have migrated the content from the oracledmt blog to this new one. However, fixing some details in the posts is still a work in progress. I hope you enjoy the new look and content!

Posted on December 18, 2013.

Funny YouTube Video Featuring Oracle Data Mining

Maybe I am too much of a data mining geek, but I found the video below to be funny. It also talks about a super cool feature ODM introduced in 11.2: the ability to score data mining models at the disk controller level in Exadata. This is a significant performance booster. It also makes it feasible to produce actionable insights from massive amounts of data extremely fast. More on this in a future post. For now, enjoy the video.

Posted on February 18, 2010.

Oracle Data Mining Races with America's Cup

For those who have not heard, the BMW Oracle Racing team won the America's Cup sailing an incredible new boat. What even those who have been following the news on the race may not know is that Oracle Data Mining helped the performance team tune the boat.

I helped with that problem, and it was a very hard one:

Posted on February 18, 2010.

Data Mining Survey - Last Call

Rexer Analytics has just issued a last call for its annual data mining survey. This is a pretty nice survey that provides a great deal of valuable information about how data mining is used and who is doing it.

To participate, please click on the link below and enter the access code in the space provided. The survey should take approximately 20 minutes to complete. At the end of the survey you can request to have the results sent to you as well as get a copy of last year's survey.

Survey Link:
Access Code: RS2008

Posted on March 24, 2009.

Oracle BIWA Summit 2008

The Oracle BIWA Summit 2008 is approaching (December 2-3). It will be held at Oracle World HQ, Redwood Shores, California. This is the second event of its kind. Last year's event was a great success and lots of fun (see details here). This year's keynotes include Jeanne Harris (co-author of "Competing on Analytics") and Usama Fayyad (legendary data miner). Here are some details and links about the event:

Posted on October 28, 2008.

Collective Intelligence 1: Building a RSS Feed Archive

For a long time I have thought that we needed data mining books written for developers. Most data mining books are written for business or data analysts. Given that, it was a pleasant surprise to read Programming Collective Intelligence: Building Smart Web 2.0 Applications by Toby Segaran. The book provides a good discussion of data mining concepts anchored with interesting examples. It also provides guidance on when to use the techniques. I still think that there is room for improvement in how data mining and analytics are presented to developers. However, the book is a great step forward in enabling developers to use analytics.

Posted on September 8, 2008.

Data Mining in Action: Oracle Sales Prospector

I firmly believe that a major trend in applications is the incorporation of analytic-enabled functionality. Users want more than just reports or a replay of the past. Users want insights and want their attention directed to key points. This is where analytics can make a big impact across all types of applications. Notice that I am not proposing exposing analytical capabilities (e.g., data mining and statistical modeling) to users. That, I think, is only effective for a small number of users. What I believe is more meaningful is to present users with functionality and information that are the result of analytics taking place behind the scenes.

In line with this trend, Oracle has recently released a new application: Oracle Sales Prospector. This application is targeted at sales representatives. It recommends which accounts they should target with specific products or services. It also indicates which deals are most likely to close within specific time frames, and provides accompanying corporate profiles as well as likely customer references.

Posted on August 22, 2008.