Say hello to “Lisa”, the most impressive customer care officer. No AI can compete.


I recently called my bank and spent ten minutes on the phone waiting, punching keys, working through iterative menus, and dialling again. I was naive enough to try to identify which category my request belonged to, and the customer care AI system was efficient enough to disconnect a client not fully versed in the bank's voice menu. After a few attempts, the AI-powered system gave up on my intelligence and connected me to Lisa.

What a relief: a truly advanced system that understood my emotions, answered my open-ended questions, had no issue with my lingo, finished my request faster, and recommended a new product, which I gladly accepted because the interaction was genuinely fun. Truly impressive customer care. Lisa is not the next generation of humanoid, but a human herself. Say hello to Lisa, the most exceptional customer care officer.

In a supposedly democratic world where a vocal few are rewriting history and being neutral is a sin, as the recent US election made evident (http://brilliantmaps.com/did-not-vote/), I want to make sure I set the right priorities for myself and my fellow professionals, being an ML, AI, and big data evangelist myself. At many conferences, professionals and startups take pride in replacing normal human interaction with a robot or an AI-powered system. While this may sound cool, it does not necessarily make business sense. Consider Bank A, which replaces human interaction with an NLP-based engine to respond to customers and may thereby be saving millions of dollars. That saving is diverted toward improving brand recognition, loyalty, and reach to potential buyers. Now imagine Bank B, which employs a large workforce, where each employee consistently and effortlessly brings in new customers through their network and relationships. The enhanced customer experience becomes a brand in itself, and customers remain loyal regardless of promotional offers from other banks. Happy employees and happy clients make the world a better place to live.

There are problems the human race has been struggling with for generations, such as poverty, food crises, natural disasters, drinking-water availability, healthcare, and education. And we have new ones, such as cyber security, abuse of social media, fraud and terrorism, efficient transportation in rural areas, and many other "big questions" that big data can help answer. As a professional, I prioritize and support projects aimed at these problems.

As IBM Watson Machine Learning, Microsoft Azure ML, and Amazon ML aim to simplify ML and empower more professionals, it is time to emphasize the first and most important phase: the phase where you ask the question and specify what it is you are interested in learning from the data. In short, "what question are we asking?"

F1, Spark and Blu-ray Player

What do F1, Spark, and a Blu-ray player have in common? Before you start browsing your "intellectual" thoughts, let me state the fact: they just happen to be a few regular events from last week. I hosted a session on Apache Spark, attended the SIA F1 weekend event, and picked up my free Blu-ray player (a gift with my TV upgrade).
Back at work, while reflecting on various pieces of advice and feedback on big data analytics deployments, it struck me that these disjoint events actually represent today's analytics ecosystem and processes.
F1 represents the ultimate in speed and agility, and Apache Spark promises to bring the same "speed and agility" to big data analytics.
F1 relies on discipline and rigour, and the key to winning is to adjust, adapt, and realign during the race (the execution) itself. The success factors for big data analytics are the same. It is not about starting with a KPI-driven big bang and a rigid data governance approach. The key to big data analytics is to start with a minimal investment and a focused, business-aligned goal, and to adjust, adapt, and realign during the development life cycle itself.
Apache Spark promises great agility across the big data analytics development life cycle. It provides the ability to create a complete data science workflow: ingesting, transforming, and preparing data, executing analytic algorithms, and analyzing and visualizing results, all on a single platform. A unified platform for such development allows teams to rapidly adjust, adapt, and realign, and thus promises to give the business the insights and agility it has been seeking.
What about the speed? Spark holds the record for the fastest sort of 100 TB of data (1 trillion records), and, much like the Mercedes engine in F1, it keeps improving with each release.


What about the Blu-ray player? While the Blu-ray player is an excellent piece of technology, I have been struggling to understand its relevance in my house. I watch movies on Apple TV; it is agile (I can decide at any time what to watch, change my preference, pay, and enjoy). I use a USB drive or external drive for any of my existing content. I see no reason to pay for costly Blu-ray discs, which limit my choice and cost me flexibility.

That last statement simply echoes the comments I have been hearing from business leaders about the value they see in their traditional data warehouse approach.
Add to this the disc and Blu-ray region-code map (data governance gone wrong), which again limits what I can play: excellent technology, but irrelevant to me today.

So what does your analytics ecosystem represent: "speed and agility" or "high cost and rigidity"?

Data Lake for the Enterprise: Using the Elasticity of the Cloud and the Safety of On-Premises Deployment

We have experienced on every analytics/data warehousing project that a disproportionate share of the time is spent on data preparation, i.e., acquiring, preparing, formatting, and normalizing the data. As the use of analytics matures in an organisation, dimensional modelling and KPI-based reporting become old-fashioned and of limited use. Subject-matter experts want access to their organisation's data to explore the content and to select, control, annotate, and access information in their own terminology, within a framework of data protection and governance. We have seen the conflict between the business's push for democratisation of data and IT's need for operational control. Thus the concept of the data lake started evolving.

A data lake, as understood by most enterprises, is a big data repository that provides data to an organization for a variety of analytics processing, including:

  • Discovery and exploration of data.
  • Simple ad hoc analytics.
  • Complex analysis for business decisions.
  • Real-time analytics and reporting.

It is possible to deploy analytics into the data lake to generate additional insight from the data loaded into it. There are two aspects to building an effective data lake: the platform infrastructure and the data flow (which includes governance and control). While I am not touching on the data flow here, I would just like to caution that imposing a rigidly structured multi-dimensional model on a data lake defeats the very objective of a data lake.

On the platform side of a data lake, Hadoop has become synonymous with data lake projects and, in fact, the concept largely evolved around it. Most vendors today offer either a cloud solution for Hadoop or an appliance-based approach. While the cloud provides elasticity, it is still a challenge for most enterprises to move all sorts of data freely to the cloud. The on-premises, appliance-based approach provides simplicity but loses the elasticity a data lake requires.

Here I suggest a hybrid model that provides both the elasticity and the safety net most enterprises require. A small on-premises Hadoop cluster provides a safe place to collect and store all data assets, especially those from your in-house applications. It can serve as a staging area for large volumes of data and as the analytics area for your confidential data. A cloud Hadoop deployment can then be used for massive transformations and deeper analytical algorithms, scaling (shrinking or growing) as required. To maintain integrity and data security, you need secure data movement, encryption, and access control.
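
As one illustration of that "safety net" step, confidential identifiers can be pseudonymised on the on-premises cluster before records are shipped to the cloud tier. This is only a sketch under assumptions: the field names, the record shape, and the use of a keyed HMAC-SHA256 hash are my illustrative choices, not part of any product's API.

```python
import hmac
import hashlib

# Assumed site secret; in practice this lives in a key vault, never in source.
SITE_SECRET = b"rotate-me-outside-source-control"

def pseudonymise(value: str) -> str:
    """Deterministic keyed hash, so joins on the field still work cloud-side."""
    return hmac.new(SITE_SECRET, value.encode("utf-8"), hashlib.sha256).hexdigest()

def prepare_for_cloud(record: dict, confidential_fields=("account_id", "msisdn")) -> dict:
    """Replace confidential fields; everything else passes through unchanged."""
    return {
        key: pseudonymise(val) if key in confidential_fields else val
        for key, val in record.items()
    }

staged = prepare_for_cloud({"account_id": "AC-1001", "country": "SG", "spend": 42})
```

Because the hash is deterministic, the cloud side can still group and join on the masked identifier, while the raw value never leaves the on-premises cluster.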

Here is a suggested architecture based on the IBM big data portfolio.

Hybrid Enterprise Data Lake Architecture


The IBM reference architecture for BigInsights (Hadoop) provides the flexibility to deploy a secure on-premises cluster of any size, while the cloud offerings for BigInsights (as PaaS, SaaS, or IaaS) provide the elasticity.

The above architecture opens up unexplored opportunities without a large initial investment, accommodates changing business dynamics and analytics requirements, and, above all, does not compromise on control. A business can choose to start experimenting with a data lake on an on-premises cluster and let its use of analytics evolve. As and when ready, it can provision a cloud cluster for any specific analysis it prefers to run there (for example, clickstream analysis of holiday-season web traffic for a retailer, or multi-facet clustering for a new promotion by a financial hub).

While SQL on Hadoop has evolved for structured data, the use of existing MPP/in-memory technologies as accelerators for operational reporting remains optional. I will publish my thoughts on the data flow and governance model for a data lake in my next blog.

Most Convincing Big Data Use Cases in the Enterprise for 2013


Over the last 12 months, I have visited around ten countries and held several web conferences with established enterprises around the world (excluding America and Europe). Everyone has been eager to understand big data and to start the journey with the most profitable use cases. However, all that enthusiasm does not by itself kick off a big data project, because the toughest job has been identifying relevant use cases that would actually be profitable. Only a few customers consider big data for optimizing operational cost; most tend to explore how they can create a new service or product, or optimize the customer experience, using big data technologies. Here are the top five use cases that were most convincing.

  1. URL log, xDR, and IPDR analysis by telcos.
  2. Data warehouse augmentation for banks.
  3. Network inspection, security, and audit across verticals.
  4. Product quality and defect tracking for manufacturing.
  5. Search across many terabytes of information.
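
To give a flavour of use case 1, the core of URL log analysis is extracting and counting fields from each record. The sketch below is a toy, plain-Python version over a few hand-made lines; the log format, field positions, and domain names are all assumptions (real xDR/IPDR feeds differ by vendor), and at telco scale the same logic would run as a distributed job.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumed log format: "<timestamp> <subscriber-ip> <url>" per line.
log_lines = [
    "2013-06-01T10:00:01 10.0.0.5 http://news.example.com/a",
    "2013-06-01T10:00:02 10.0.0.6 http://video.example.net/clip",
    "2013-06-01T10:00:03 10.0.0.5 http://news.example.com/b",
]

# Count visits per domain: split each line, take the URL field, keep its host.
domains = Counter(urlparse(line.split()[2]).netloc for line in log_lines)
top = domains.most_common(1)
```

The same count-by-key shape underlies most of the telco traffic-profiling questions (top domains, heaviest subscribers, busiest hours), which is why this use case tends to be an easy, convincing starting point.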

The most discussed use case, meanwhile, has been social media analytics for understanding customer buying patterns and sentiment. However, after a few rounds of discussion, it becomes challenging to justify a tangible benefit and ROI. Considering the data availability and veracity, the patchwork of point tools, and the lack of a strategy around social media, this one proved a challenging place to start.