Artificial Intelligence (AI) will soon be at the heart of every major technological system in the world, managing and providing access to mission-critical data.
Just a few of its uses are cyber and homeland security, anti-money laundering, payments, financial markets, biotech, healthcare, marketing, natural language processing (NLP), computer vision, electrical grids, nuclear power plants, air traffic control, and the Internet of Things (IoT).
While Artificial Intelligence is becoming a major staple of technology, few people understand the benefits and shortcomings of AI and machine learning technologies.
Machine learning is the science of getting computers to act without being explicitly programmed. Machine learning is applied in various fields such as computer vision, speech recognition, NLP, web search, biotech, risk management, cyber security, and many others.
The machine learning paradigm can be viewed as “programming by example”. Two types of learning are commonly used: supervised and unsupervised. In supervised learning, a collection of labeled patterns is provided, and performance is measured by how accurately the system labels newly encountered patterns. The labeled patterns are used to learn descriptions of the classes, which in turn are used to label new patterns. In unsupervised learning, the problem is to group a given collection of unlabeled patterns into meaningful categories.
Within supervised learning, there are two different types of tasks: classification and regression. In classification, the goal is to assign objects to a fixed set of categories. Regression, on the other hand, predicts a real value. For instance, we may wish to predict changes in the price of a stock, and both methods can be applied to derive insights. The classification method determines whether the stock price will rise or fall, and the regression method predicts how much the stock will increase or decrease.
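To make the distinction concrete, here is a minimal sketch that applies the same simple 1-nearest-neighbour learner to the stock example as both a classification task and a regression task. The data points and feature (yesterday's percentage change) are invented for illustration:

```python
def nearest(train, x):
    """Return the label of the training example whose feature is closest to x."""
    return min(train, key=lambda ex: abs(ex[0] - x))[1]

# Classification: the label is a category (the direction of the move).
clf_train = [(-2.0, "fall"), (-0.5, "fall"), (0.4, "rise"), (1.8, "rise")]
# Regression: the label is a real value (the size of the move, in %).
reg_train = [(-2.0, -1.1), (-0.5, -0.2), (0.4, 0.3), (1.8, 1.5)]

print(nearest(clf_train, 0.5))  # -> rise
print(nearest(reg_train, 0.5))  # -> 0.3
```

The same learning idea serves both tasks; only the type of label changes.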
Brighterion’s white paper, Artificial Intelligence and Machine Learning: The Next Generation, explains the difference between many of these legacy technologies and the advances in AI and ML. Here’s a brief summary:
Traditional logic typically categorizes information into binary patterns such as black/white, yes/no, or true/false. Fuzzy logic introduces a middle ground in which statements can be partially true and partially false, which accounts for much of day-to-day human reasoning. For example, stating that a tall person is over 6′2″ traditionally means that people under 6′2″ are not tall. If a person is nearly 6′2″, common sense says the person is also somewhat tall. Boolean logic states that a person is either tall or short and allows no middle ground, while fuzzy logic allows different interpretations for varying degrees of height.
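As an illustration, a fuzzy membership function for "tall" might be sketched as follows; the ramp boundaries (5′10″ to 6′2″) are illustrative assumptions, not part of any standard:

```python
def tall(height_inches):
    """Degree of membership in the fuzzy set 'tall', between 0.0 and 1.0."""
    low, high = 70, 74  # ramp from 5'10" (not tall) to 6'2" (fully tall)
    if height_inches <= low:
        return 0.0
    if height_inches >= high:
        return 1.0
    return (height_inches - low) / (high - low)

print(tall(68))  # clearly not tall      -> 0.0
print(tall(73))  # nearly 6'2": somewhat tall -> 0.75
print(tall(75))  # over 6'2"             -> 1.0
```

Where Boolean logic would force `tall(73)` to be simply true or false, the fuzzy version returns a degree of truth.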
Neural networks, data mining, CBR, and business rules can all benefit from fuzzy logic. For example, fuzzy logic can be used in CBR to automatically cluster information into categories, which improves performance by decreasing sensitivity to noise and outliers. Fuzzy logic also allows business rule experts to write more powerful rules. Here is an example of a rule that has been rewritten to leverage fuzzy logic:
When the number of cross border transactions is high and the transaction occurs in the evening then the transaction may be suspicious.
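A rule like this could be evaluated by combining fuzzy membership degrees. The sketch below uses illustrative membership functions (the ramp thresholds are invented for this example) and takes the fuzzy AND of two conditions as the minimum of their degrees, a common convention:

```python
def high_cross_border(count):
    """Degree to which the cross-border transaction count is 'high' (ramp: 5 -> 15)."""
    return min(max((count - 5) / 10, 0.0), 1.0)

def in_evening(hour):
    """Degree to which an hour of day counts as 'evening' (ramp: 17:00 -> 20:00)."""
    return min(max((hour - 17) / 3, 0.0), 1.0)

def suspicious(count, hour):
    # Fuzzy AND is commonly taken as the minimum of the membership degrees.
    return min(high_cross_border(count), in_evening(hour))

print(suspicious(12, 21))  # many cross-border txns, late evening -> 0.7
print(suspicious(2, 21))   # few cross-border txns -> 0.0
```

The rule's output is a degree of suspicion rather than a hard yes/no, which downstream systems can threshold or combine with other evidence.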
Business rules management system
A business rules management system (BRMS) enables companies to easily define, deploy, monitor, and maintain new regulations, procedures, policies, market opportunities, and workflows. One of the main advantages of business rules is that they can be written by business analysts without the need for IT resources.
Business rules represent policies, procedures, and constraints regarding how an enterprise conducts business. Business rules can, for example, encode an organization’s policies for considering a transaction suspicious. A fraud expert writes rules to detect suspicious transactions. However, the same rules are also applied to customers whose unique spending behaviors are not accounted for properly in the rule set, resulting in poor detection rates and high false positives. Additionally, risk systems based only on rules detect anomalous behavior associated with the existing rules alone; they cannot identify new anomalies, which can appear daily. As a result, rules-based systems are outdated almost as soon as they are implemented.
A neural network is a technology loosely inspired by the structure of the brain. A neural network consists of many simple elements called artificial neurons, each producing a sequence of activations. The elements used in a neural network are far simpler than biological neurons. The number of elements and their interconnections are orders of magnitude fewer than the number of neurons and synapses in the human brain.
Backpropagation (BP) is the most popular supervised neural network learning algorithm. A network trained with BP is organized into layers of neurons, with connections between the layers. The leftmost layer is called the input layer. The rightmost, or output, layer contains the output neurons. Finally, the middle layers are called hidden layers. While the design of the input and output layers of a neural network is straightforward, there is an art to the design of the hidden layers. Designing and training a neural network requires choosing the number and types of nodes, layers, learning rates, training data, and test sets.
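The mechanics can be sketched in a few dozen lines: a tiny 2-4-1 network trained with backpropagation on the XOR function. The layer sizes, learning rate, and epoch count here are illustrative choices, and backpropagation can occasionally settle in a local minimum, so results vary with initialization:

```python
import math, random

random.seed(0)
HIDDEN, LR = 4, 0.5

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Weights: each hidden neuron has 2 input weights + a bias;
# the output neuron has HIDDEN weights + a bias.
w_h = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(HIDDEN)]
w_o = [random.uniform(-1, 1) for _ in range(HIDDEN + 1)]

data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

def forward(x):
    """Forward pass: input layer -> hidden layer -> output neuron."""
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_h]
    o = sigmoid(sum(w_o[i] * h[i] for i in range(HIDDEN)) + w_o[-1])
    return h, o

for _ in range(20000):
    for x, target in data:
        h, o = forward(x)
        # Backward pass: propagate the output error back to the hidden layer.
        d_o = (target - o) * o * (1 - o)
        d_h = [d_o * w_o[i] * h[i] * (1 - h[i]) for i in range(HIDDEN)]
        # Gradient-descent weight updates.
        for i in range(HIDDEN):
            w_o[i] += LR * d_o * h[i]
            w_h[i][0] += LR * d_h[i] * x[0]
            w_h[i][1] += LR * d_h[i] * x[1]
            w_h[i][2] += LR * d_h[i]
        w_o[-1] += LR * d_o

for x, target in data:
    print(x, round(forward(x)[1]))  # typically recovers the XOR truth table
```

Note how the hidden-layer update `d_h` depends on the output error `d_o`: the error is propagated backward through the network, which is what gives the algorithm its name.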
Deep learning, a term describing a set of algorithms that use a neural network as the underlying architecture, has recently generated many headlines. Deep neural networks learn hierarchical layers of representation from the input to perform pattern recognition. When the problem exhibits non-linear properties, deep networks are computationally more attractive than classical neural networks. A deep network can be viewed as a program in which the functions computed by the lower-layered neurons are subroutines. These subroutines are reused many times in the computation of the final program.
Although deep learning garners much attention, it has inherent restrictions that limit its application and effectiveness in many industries and fields: it requires human expertise and significant time to design and train.
Data mining, or knowledge discovery in databases, is the nontrivial extraction of implicit, previously unknown and potentially useful information from data. Statistical methods are used that enable trends and other relationships to be identified in large databases.
Data mining has attracted attention because of the wide availability of vast amounts of data and the need to turn such data into useful information and knowledge. The knowledge gained can be used for applications ranging from risk monitoring, business management, and production control to market analysis, engineering, and scientific exploration.
Case-based reasoning
Case-based reasoning (CBR) is a problem-solving paradigm that differs from other major AI approaches. CBR learns from past experiences to solve new problems. Rather than relying on a domain expert to write rules or make associations along generalized relationships between problem descriptors and conclusions, a CBR system learns from previous experience in the same way a physician learns from patients. A CBR system creates generic cases based on the diagnosis and treatment of previous patients to determine the disease and treatment for a new patient. Implementing a CBR system consists of identifying relevant case features, and the system continually learns from each new situation. Generalized cases can provide explanations that are richer than those generated by chains of rules. The most important limitations concern how cases are efficiently represented, how indexes are created, and how individual cases are generalized.
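The retrieve-and-reuse step at the heart of CBR can be sketched as follows, using an invented toy case base and a simple count-of-matching-features similarity (real systems use far richer case representations and indexing):

```python
# Toy case base: past patients with observed symptoms and a known diagnosis.
cases = [
    {"fever": 1, "cough": 1, "rash": 0, "diagnosis": "flu"},
    {"fever": 0, "cough": 1, "rash": 0, "diagnosis": "cold"},
    {"fever": 1, "cough": 0, "rash": 1, "diagnosis": "measles"},
]

FEATURES = ["fever", "cough", "rash"]

def similarity(case, query):
    """Count matching features between a stored case and the new problem."""
    return sum(case[f] == query[f] for f in FEATURES)

def retrieve(query):
    """Retrieve the most similar past case and reuse its solution."""
    best = max(cases, key=lambda c: similarity(c, query))
    return best["diagnosis"]

# A new patient: ties are broken by case order, so this retrieves "flu".
print(retrieve({"fever": 1, "cough": 1, "rash": 1}))
```

A full CBR cycle would then revise the reused solution if needed and retain the new case, which is how the system learns from each situation.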
Genetic algorithms work by simulating the logic of Darwinian selection, in which only the best performers are selected for reproduction. Over many generations, natural populations evolve according to the principles of natural selection. A genetic algorithm can be thought of as a population of individuals represented by chromosomes. In computing terms, a genetic algorithm implements this model by using arrays of bits or characters (binary strings) to represent the chromosomes. Each string represents a potential solution. The genetic algorithm then manipulates the most promising chromosomes in search of improved solutions. A genetic algorithm operates through a cycle of three stages:
- Build and maintain a population of solutions to a problem
- Choose the better solutions for recombination with each other
- Use their offspring to replace poorer solutions
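The three-stage cycle can be sketched as a small program that maximizes the number of 1-bits in a binary string (the classic "OneMax" toy problem; the population size, mutation rate, and generation count are illustrative choices):

```python
import random

random.seed(1)
LENGTH, POP, GENS = 20, 30, 60

def fitness(chrom):
    return sum(chrom)  # count of 1-bits; the optimum is LENGTH

def crossover(a, b):
    cut = random.randrange(1, LENGTH)  # single-point recombination
    return a[:cut] + b[cut:]

def mutate(chrom, rate=0.02):
    return [bit ^ (random.random() < rate) for bit in chrom]

# Stage 1: build and maintain a population of candidate solutions.
population = [[random.randint(0, 1) for _ in range(LENGTH)] for _ in range(POP)]

for _ in range(GENS):
    # Stage 2: choose the better solutions for recombination.
    population.sort(key=fitness, reverse=True)
    parents = population[:POP // 2]
    # Stage 3: use their offspring to replace the poorer solutions.
    offspring = [mutate(crossover(random.choice(parents), random.choice(parents)))
                 for _ in range(POP - len(parents))]
    population = parents + offspring

best = max(population, key=fitness)
print(fitness(best))  # close to the optimum of 20 after 60 generations
```

Keeping the parents unchanged each generation (elitism) guarantees the best fitness never decreases, which is a common design choice in practice.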
Genetic algorithms also complement existing machine learning technologies: they can be used in data mining for field/attribute selection, and they can be combined with neural networks to determine optimal weights and architectures.
Next-generation artificial intelligence and machine learning
We’ve seen that current AI and machine learning technologies suffer from various limitations. Most importantly, they lack the capacity for:
Personalization: To successfully protect and serve customers, employees, and audiences, we must know them by their unique and individual behavior over time, not by static, generic categorization.
Adaptability: Relying on models based only on historical data or expert rules is ineffective, as new trends and behaviors arise daily.
Self-learning: An intelligent system should learn over time from every activity associated with each specific entity.
To illustrate these limits, consider the challenges of two important business fields: network security and fraud prevention. Fraud and intrusion are perpetually changing and never remain static. Fraudsters and hackers are criminals who continuously adjust and adapt their techniques. Controlling fraud and intrusion within a network environment requires a dynamic and continuously evolving process. A static set of rules, or a machine learning model developed from historical data, has only short-term value.
In network security, dozens of new malware programs with ever more sophisticated methods of embedding and disguising themselves appear on the internet every day. In most cases, after vulnerabilities are discovered, a patch is released to address them. The problem is that it is often easy for hackers to reverse engineer the patch, so another defect is found and exploited within hours of the patch’s release. The Aurora attack against Google and several other companies, which originated in China in the fall of 2009, exploited previously undiscovered dangling pointers in a Microsoft browser.
Tools that autonomously detect new attacks against specific targets, networks, or individual computers are needed. They must be able to change their parameters to thrive in new environments, learn from each individual activity, respond to different situations in different ways, and track and adapt to the specific situation and behavior of every entity of interest over time. This continuous, one-to-one behavioral analysis provides real-time actionable insights.
Smart Agents technology
Smart Agents technology is the only AI approach able to overcome the limits of legacy machine learning technologies, enabling personalization, adaptability, and self-learning.
Smart Agents technology is a personalization technology that creates a virtual representation of every entity and learns/builds a profile from the entity’s actions and activities. In the payment industry, for example, a Smart Agent is associated with each individual cardholder, merchant, or terminal. The Smart Agent associated with an entity (such as a card or merchant) learns in real time from every transaction made and builds that entity’s specific and unique behavior over time. There are as many Smart Agents as active entities in the system. For example, if 200 million cards are transacting, 200 million Smart Agents are instantiated to analyze and learn the behavior of each. Decision-making is specific to each cardholder and no longer relies on logic that is universally applied to all cardholders regardless of their individual characteristics. Smart Agents are self-learning and adaptive, since they continuously update their individual profiles from each activity and action performed by the entity. Each Smart Agent pulls in all relevant data across multiple channels, irrespective of the type, format, or source of the data, to produce virtual profiles.
So, for example, in a financial portfolio management system, a multi-agent system consists of Smart Agents that cooperatively monitor and track stock quotes, financial news, and company earnings reports to continuously make suggestions to the portfolio manager.
Each profile is automatically updated in real-time and the resulting intelligence is shared across the Smart Agents. This one-to-one behavioral profiling provides unprecedented, omni-channel visibility into the behavior of an entity.
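This one-to-one profiling can be illustrated with a toy sketch: one lightweight agent per card, each learning that card’s own spending average in real time. The fields, thresholds, and data below are invented for illustration and are not Brighterion’s implementation:

```python
class CardAgent:
    """A tiny per-entity profile: running transaction count and total."""

    def __init__(self):
        self.count, self.total = 0, 0.0

    def average(self):
        return self.total / self.count if self.count else 0.0

    def score(self, amount):
        """Flag a transaction that far exceeds this card's own average,
        then fold the transaction into the profile (real-time learning)."""
        unusual = self.count >= 3 and amount > 3 * self.average()
        self.count += 1
        self.total += amount
        return "suspicious" if unusual else "normal"

agents = {}  # one agent per active entity, created on first sight

def process(card_id, amount):
    agent = agents.setdefault(card_id, CardAgent())
    return agent.score(amount)

for amount in [20.0, 25.0, 30.0]:
    process("card-A", amount)       # the agent learns card-A's habits
print(process("card-A", 500.0))     # far above card-A's own average -> suspicious
print(process("card-B", 500.0))     # card-B's first txn, no profile yet -> normal
```

Note that the same 500.0 transaction is scored differently for the two cards: the decision is relative to each entity’s own learned behavior, not to a universal rule.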
A comprehensive intelligent solution must combine the benefits of existing artificial intelligence and machine learning techniques with the unique capabilities of Smart Agents technology. The result is a comprehensive solution that is intelligent, self-learning and adaptive.
For more detail, please download our white paper, Artificial Intelligence and Machine Learning: The Next Generation.