Big Data – Data Science Blog (English only)

Posts

Simple RNN: the first foothold for understanding LSTM

June 17, 2020/0 Comments/in Artificial Intelligence, Data Science, Deep Learning, Machine Learning, Main Category, Mathematics /by Yasuto Tamura

*In this article “Densely Connected Layers” is written as “DCL,” and “Convolutional Neural Network” as “CNN.”

In the last article, I mentioned “When it comes to the structure of RNN, many study materials try to avoid showing that RNNs are also connections of neurons, as well as DCL or CNN.” Even if you manage to understand DCL and CNN, you can be suddenly left behind once you try to understand RNN because it looks like a different field. In the second section of this article, I am going to provide a some helps for more abstract understandings of DCL/CNN , which you need when you read most other study materials.

My explanation on this simple RNN is based on a chapter in a textbook published by Massachusetts Institute of Technology, which is also recommended in some deep learning courses of Stanford University.

First of all, you should keep it in mind that simple RNN are not useful in many cases, mainly because of vanishing/exploding gradient problem, which I am going to explain in the next article. LSTM is one major type of RNN used for tackling those problems. But without clear understanding forward/back propagation of RNN, I think many people would get stuck when they try to understand how LSTM works, especially during its back propagation stage. If you have tried climbing the mountain of understanding LSTM, but found yourself having to retreat back to the foot, I suggest that you read through this article on simple RNNs. It should help you to gain a solid foothold, and you would be ready for trying to climb the mountain again.

*This article is the second article of “A gentle introduction to the tiresome part of understanding RNN.”

1, A brief review on back propagation of DCL.

Simple RNNs are straightforward applications of DCL, but if you do not even have any ideas on DCL forward/back propagation, you will not be able to understand this article. If you more or less understand how back propagation of DCL works, you can skip this first section.

Deep learning is a part of machine learning. And most importantly, whether it is classical machine learning or deep learning, adjusting parameters is what machine learning is all about. Parameters mean elements of functions except for variants. For example when you get a very simple function $f(x)=a + bx + cx^2 + dx^3$ , then $x$ is a variant, and $a, b, c, d$ are parameters. In case of classical machine learning algorithms, the number of those parameters are very limited because they were originally designed manually. Such functions for classical machine learning is useful for features found by humans, after trial and errors(feature engineering is a field of finding such effective features, manually). You adjust those parameters based on how different the outputs(estimated outcome of classification/regression) are from supervising vectors(the data prepared to show ideal answers).

In the last article I said neural networks are just mappings, whose inputs are vectors, matrices, or sequence data. In case of DCLs, inputs are vectors. Then what’s the number of parameters ? The answer depends on the the number of neurons and layers. In the example of DCL at the right side, the number of the connections of the neurons is the number of parameters(Would you like to try to count them? At least I would say “No.”). Unlike classical machine learning you no longer need to do feature engineering, but instead you need to design networks effective for each task and adjust a lot of parameters.

*I think the hype of AI comes from the fact that neural networks find features automatically. But the reality is difficulty of feature engineering was just replaced by difficulty of designing proper neural networks.

It is easy to imagine that you need an efficient way to adjust those parameters, and the method is called back propagation (or just backprop). As long as it is about DCL backprop, you can find a lot of well-made study materials on that, so I am not going to cover that topic precisely in this article series. Simply putting, during back propagation, in order to adjust parameters of a layer you need errors in the next layer. And in order calculate the errors of the next layer, you need errors in the next next layer.

*You should not think too much about what the “errors” exactly mean. Such “errors” are defined in this context, and you will see why you need them if you actually write down all the mathematical equations behind backprops of DCL.

The red arrows in the figure shows how errors of all the neurons in a layer propagate backward to a neuron in last layer. The figure shows only some sets of such errors propagating backward, but in practice you have to think about all the combinations of such red arrows in the whole back propagation(this link would give you some ideas on how DCLs work).

These points are minimum prerequisites for continuing reading this RNN this article. But if you are planning to understand RNN forward/back propagation at an abstract/mathematical level that you can read academic papers, I highly recommend you to actually write down all the equations of DCL backprop. And if possible you should try to implement backprop of three-layer DCL.

2, Forward propagation of simple RNN

In fact the simple RNN which we are going to look at in this article has only three layers. From now on imagine that inputs of RNN come from the bottom and outputs go up. But RNNs have to keep information of earlier times steps during upcoming several time steps because as I mentioned in the last article RNNs are used for sequence data, the order of whose elements is important. In order to do that, information of the neurons in the middle layer of RNN propagate forward to the middle layer itself. Therefore in one time step of forward propagation of RNN, the input at the time step propagates forward as normal DCL, and the RNN gives out an output at the time step. And information of one neuron in the middle layer propagate forward to the other neurons like yellow arrows in the figure. And the information in the next neuron propagate forward to the other neurons, and this process is repeated. This is called recurrent connections of RNN.

*To be exact we are just looking at a type of recurrent connections. For example Elman RNNs have simpler recurrent connections. And recurrent connections of LSTM are more complicated.

Whether it is a simple one or not, basically RNN repeats this process of getting an input at every time step, giving out an output, and making recurrent connections to the RNN itself. But you need to keep the values of activated neurons at every time step, so virtually you need to consider the same RNNs duplicated for several time steps like the figure below. This is the idea of unfolding RNN. Depending on contexts, the whole unfolded DCLs with recurrent connections is also called an RNN.

In many situations, RNNs are simplified as below. If you have read through this article until this point, I bet you gained some better understanding of RNNs, so you should little by little get used to this more abstract, blackboxed way of showing RNN.

You have seen that you can unfold an RNN, per time step. From now on I am going to show the simple RNN in a simpler way, based on the MIT textbook which I recomment. The figure below shows how RNN propagate forward during two time steps $(t-1), (t)$ .

The input $\boldsymbol{x}^{(t-1)}$ at time step $(t-1)$ propagate forward as a normal DCL, and gives out the output $\hat{\boldsymbol{y}} ^{(t)}$ (The notation on the $\boldsymbol{y} ^{(t)}$ is called “hat,” and it means that the value is an estimated value. Whatever machine learning tasks you work on, the outputs of the functions are just estimations of ideal outcomes. You need to adjust parameters for better estimations. You should always be careful whether it is an actual value or an estimated value in the context of machine learning or statistics). But the most important parts are the middle layers.

*To be exact I should have drawn the middle layers as connections of two layers of neurons like the figure at the right side. But I made my figure closer to the chart in the MIT textbook, and also most other study materials show the combinations of the two neurons before/after activation as one neuron.

$\boldsymbol{a}^{(t)}$ is just linear summations of $\boldsymbol{x}^{(t)}$ (If you do not know what “linear summations” mean, please scroll this page a bit), and $\boldsymbol{h}^{(t)}$ is a combination of activated values of $\boldsymbol{a}^{(t)}$ and linear summations of $\boldsymbol{h}^{(t-1)}$ from the last time step, with recurrent connections. The values of $\boldsymbol{h}^{(t)}$ propagate forward in two ways. One is normal DCL forward propagation to $\hat{\boldsymbol{y}} ^{(t)}$ and $\boldsymbol{o}^{(t)}$ , and the other is recurrent connections to $\boldsymbol{h}^{(t+1)}$ .

These are equations for each step of forward propagation.

$\boldsymbol{a}^{(t)} = \boldsymbol{b} + \boldsymbol{W} \cdot \boldsymbol{h}^{(t-1)} + \boldsymbol{U} \cdot \boldsymbol{x}^{(t)}$
$\boldsymbol{h}^{(t)}= g(\boldsymbol{a}^{(t)})$
$\boldsymbol{o}^{(t)} = \boldsymbol{c} + \boldsymbol{V} \cdot \boldsymbol{h}^{(t)}$
$\hat{\boldsymbol{y}} ^{(t)} = f(\boldsymbol{o}^{(t)})$

*Please forgive me for adding some mathematical equations on this article even though I pledged not to in the first article. You can skip the them, but for some people it is on the contrary more confusing if there are no equations. In case you are allergic to mathematics, I prescribed some treatments below.

*Linear summation is a type of weighted summation of some elements. Concretely, when you have a vector $\boldsymbol{x}=(x_0, x_1, x_2)$ , and weights $\boldsymbol{w}=(w_0,w_1, w_2)$ , then $\boldsymbol{w}^T \cdot \boldsymbol{x} = w_0 \cdot x_0 + w_1 \cdot x_1 +w_2 \cdot x_2$ is a linear summation of $\boldsymbol{x}$ , and its weights are $\boldsymbol{w}$ .

*When you see a product of a matrix and a vector, for example a product of $\boldsymbol{W}$ and $\boldsymbol{v}$ , you should clearly make an image of connections between two layers of a neural network. You can also say each element of $\boldsymbol{u}}$ is a linear summations all the elements of $\boldsymbol{v}}$ , and $\boldsymbol{W}$ gives the weights for the summations.

A very important point is that you share the same parameters, in this case $\boldsymbol{\theta \in \{\boldsymbol{U}, \boldsymbol{W}, \boldsymbol{b}, \boldsymbol{V}, \boldsymbol{c} \}}$ , at every time step.

And you are likely to see this RNN in this blackboxed form.

3, The steps of back propagation of simple RNN

In the last article, I said “I have to say backprop of RNN, especially LSTM (a useful and mainstream type or RNN), is a monster of chain rules.” I did my best to make my PowerPoint on LSTM backprop straightforward. But looking at it again, the LSTM backprop part still looks like an electronic circuit, and it requires some patience from you to understand it. If you want to understand LSTM at a more mathematical level, understanding the flow of simple RNN backprop is indispensable, so I would like you to be patient while understanding this step (and you have to be even more patient while understanding LSTM backprop).

This might be a matter of my literacy, but explanations on RNN backprop are very frustrating for me in the points below.

Most explanations just show how to calculate gradients at each time step.
Most study materials are visually very poor.
Most explanations just emphasize that “errors are back propagating through time,” using tons of arrows, but they lack concrete instructions on how actually you renew parameters with those errors.

If you can relate to the feelings I mentioned above, the instructions from now on could somewhat help you. And I am going to share some study materials on simple RNNs in an external link so that you can gain a clear and mathematical understanding on how simple RNNs work.

Backprop of RNN , as long as you are thinking about simple RNNs, is not so different from that of DCLs. But you have to be careful about the meaning of errors in the context of RNN backprop. Back propagation through time (BPTT) is one of the major methods for RNN backprop, and I am sure most textbooks explain BPTT. But most study materials just emphasize that you need errors from all the time steps, and I think that is very misleading and confusing.

You need all the gradients to adjust parameters, but you do not necessarily need all the errors to calculate those gradients. Gradients in the context of machine learning mean partial derivatives of error functions (in this case $J$ ) with respect to certain parameters, and mathematically a gradient of $J$ with respect to $\boldsymbol{\theta \in \{\boldsymbol{U}, \boldsymbol{W}, \boldsymbol{b}^{(t)}, \boldsymbol{V}, \boldsymbol{c} \}}$ is denoted as $( \frac{\partial J}{\partial \boldsymbol{\theta}} )$ . And another confusing point in many textbooks, including the MIT one, is that they give an impression that parameters depend on time steps. For example some study materials use notations like $\frac{\partial J}{\partial \boldsymbol{\theta}^{(t)}}$ , and I think this gives an impression that this is a gradient with respect to the parameters at time step $(t)$ . In my opinion this gradient rather should be written as $( \frac{\partial J}{\partial \boldsymbol{\theta}} )^{(t)}$ . But many study materials denote gradients of those errors in the former way, so from now on let me use the notations which you can see in the figures in this article.

In order to calculate the gradient $\frac{\partial J}{\partial \boldsymbol{x}^{(t)}}$ you need errors from time steps $s (s \geq t) \quad$ (as you can see in the figure, in order to calculate a gradient in a colored frame, you need all the errors in the same color).

*Another confusing point is that the $\frac{\partial J}{\partial \boldsymbol{\ast ^{(t)}}}, \boldsymbol{\ast} \in \{\boldsymbol{a}^{(t)}, \boldsymbol{h}^{(t)}, \boldsymbol{o}^{(t)}, \dots \}$ are correct notations, because $\boldsymbol{\ast}$ are values of neurons after forward propagation. They depend on time steps, and these are very values which I have been calling “errors.” That is why parameters do not depend on time steps, whereas errors depend on time steps.

As I mentioned before, you share the same parameters at every time step. Again, please do not assume that parameters are different from time step to time step. It is gradients/errors (you need errors to calculate gradients) which depend on time step. And after calculating errors at every time step, you can finally adjust parameters one time, and that’s why this is called “back propagation through time.” (It is easy to imagine that this method can be very inefficient. If the input is the whole text on a Wikipedia link, you need to input all the sentences in the Wikipedia text to renew parameters one time. To solve this problem there is a backprop method named “truncated BPTT,” with which you renew parameters based on a part of a text. )

And after calculating those gradients $\frac{\partial J}{\partial \boldsymbol{\theta}^{(t)}}$ you can take a summation of them: $\frac{\partial J}{\partial \boldsymbol{\theta}}=\sum_{t=0}^{t=\tau}{\frac{\partial J}{\partial \boldsymbol{\theta}^{(t)}}}$ . With this gradient $\frac{\partial J}{\partial \boldsymbol{\theta}}$ , you can finally renew the value of $\boldsymbol{\theta}$ one time.

At the beginning of this article I mentioned that simple RNNs are no longer for practical uses, and that comes from exploding/vanishing problem of RNN. This problem was one of the reasons for the AI winter which lasted for some 20 years. In the next article I am going to write about LSTM, a fancier type of RNN, in the context of a history of neural network history.

* I make study materials on machine learning, sponsored by DATANOMIQ. I do my best to make my content as straightforward but as precise as possible. I include all of my reference sources. If you notice any mistakes in my materials, including grammatical errors, please let me know (email: yasuto.tamura@datanomiq.de). And if you have any advice for making my materials more understandable to learners, I would appreciate hearing it.

Data Analytics & Artificial Intelligence Trends in 2020

May 13, 2020/0 Comments/in Data Science Hack, Data Science News, Insights /by Aran Davies

Artificial intelligence has infiltrated all aspects of our lives and brought significant improvements.

Although the first thing that comes to most people’s minds when they think about AI are humanoid robots or intelligent machines from sci-fi flicks, this technology has had the most impressive advancements in the field of data science.

Big data analytics is what has already transformed the way we do business as it provides an unprecedented insight into a vast amount of unstructured, semi-structured, and structured data by analyzing, processing, and interpreting it.

Data and AI specialists and researchers are likely to have a field day in 2020, so here are some of the most important trends in this industry.

1. Predictive Analytics

As its name suggests, this trend will be all about using gargantuan data sets in order to predict outcomes and results.

This practice is slated to become one of the biggest trends in 2020 because it will help businesses improve their processes tremendously. It will find its place in optimizing customer support, pricing, supply chain, recruitment, and retail sales, to name just a few.

For example, Amazon has already been leveraging predictive analytics for its dynamic pricing model. Namely, the online retail giant uses this technology to analyze the demand for a particular product, competitors’ prices, and a number of other parameters in order to adjust its price.

According to stats, Amazon changes prices 2.5 million times a day so that a particular product’s cost fluctuates and changes every 10 minutes, which requires an extremely predictive analytics algorithm.

2. Improved Cybersecurity

In a world of advanced technologies where IoT and remotely controlled devices having top-notch protection is of critical importance.

Numerous businesses and individuals have fallen victim to ruthless criminals who can steal sensitive data or wipe out entire bank accounts. Even some big and powerful companies suffered huge financial and reputation blows due to cyber attacks they were subjected to.

This kind of crime is particularly harsh for small and medium businesses. Stats say that 60% of SMBs are forced to close down after being hit by such an attack.

AI again takes advantage of its immense potential for analyzing and processing data from different sources quickly and accurately. That’s why it’s capable of assisting cybersecurity specialists in predicting and preventing attacks.

In case that an attack emerges, the response time is significantly shorter, so that the worst-case scenario can be avoided.

When we’re talking about avoiding security risks, AI can improve enterprise risk management, too, by providing guidance and assisting risk management professionals.

3. Digital Workers

In 2020, an army of digital workers will transform the traditional workspace and take productivity to a whole new level.

Virtual assistants and chatbots are some examples of already existing digital workers, but it will be even more of them. According to research, this trend is one the rise, as it’s expected that AI software and robots will increase by 50% by 2022.

Robots will take over even some small tasks in the office. The point is to streamline the entire business process, and that can be achieved by training robots to perform small and simple tasks like human employees. The only difference will be that digital workers will do that faster and without any mistakes.

4. Hybrid Workforce

Many people worry that AI and automation will steal their jobs and render them unemployed.

Even the stats are bleak – AI will eliminate 1.8 million jobs. But, on the other hand, it will create 2.3 million new jobs.

So, our future is actually AI and humans working together, and that’s what will become the business normalcy in 2020.

Robotic process automation and different office digital workers will be in charge of tedious and repetitive tasks, while more sophisticated issues that require critical thinking and creativity will be human workers’ responsibility.

One of the most important things about creating this hybrid workforce is for businesses to openly discuss it with their employees and explain how these new technologies will be used. A regular workforce has to know that they will be working alongside machines whose job will be to speed up the processes and cut costs.

5. Process Intelligence

This AI trend will allow businesses to gain insight into their processes by using all the information contained in their system and creating an overall, real-time, and accurate visual model of all the processes.

What’s great about it is that it’s possible to see these processes from different perspectives – across departments, functions, staff, and locations.

With such a visual model, it’s possible to properly analyze these processes, identify potential bottlenecks, and eliminate them before they even begin to emerge.

Besides, as this is AI and data analytics at their best, this technology will also facilitate decision-making by predicting the future results of tech investments.

Needless to say, Process Intelligence will become an enterprise standard very soon, thanks to its ability to provide a better understanding and effective management of end-to-end processes.

As you can see, in 2020, these two advanced technologies will continue to evolve and transform the business landscape and change it for the better.

Glorious career paths of a Big Data Professional

October 18, 2019/0 Comments/in Big Data, Business Analytics, Data Engineering, Data Security /by Digvijay Upadhyay

Are you wondering about the career profiles you may get to fill if you get into Big Data industry? If yes, then Bingo! This is the post that will inform you just about that. Big data is just an umbrella term. There are a lot of profiles and career paths that are covered under this umbrella term. Let us have a look at some of these profiles.

Data Visualisation Specialist

The process of visualizing data is turning out to be critical in guaranteeing information-driven representatives get the upfront investment required to actualize goal-oriented and significant Big Data extends in their organization. Making your data to tell a story and the craft of envisioning information convincingly has turned into a significant piece of the Big Data world and progressively associations need to have these capacities in-house. Besides, as a rule, these experts are relied upon to realize how to picture in different instruments, for example, Spotfire, D3, Carto, and Tableau – among numerous others. Information Visualization Specialists should be versatile and inquisitive to guarantee they stay aware of most recent patterns and answers for a recount to their information stories in the most intriguing manner conceivable with regards to the board room.

Big Data Architect

This is the place the Hadoop specialists come in. Ordinarily, a Big Data planner tends to explicit information issues and necessities, having the option to portray the structure and conduct of a Big Data arrangement utilizing the innovation wherein they practice – which is, as a rule, mostly Hadoop.

These representatives go about as a significant connection between the association (and its specific needs) and Data Scientists and Engineers. Any organization that needs to assemble a Big Data condition will require a Big Data modeler who can serenely deal with the total lifecycle of a Hadoop arrangement – including necessity investigation, stage determination, specialized engineering structure, application plan, and advancement, testing the much-dreaded task of deploying lastly.

Systems Architect

This Big data professional is in charge of how your enormous information frameworks are architected and interconnected. Their essential incentive to your group lies in their capacity to use their product building foundation and involvement with huge scale circulated handling frameworks to deal with your innovation decisions and execution forms. You’ll need this individual to construct an information design that lines up with the business, alongside abnormal state anticipating the improvement. The person in question will consider different limitations, adherence to gauges, and varying needs over the business.

Here are some responsibilities that they play:

- Determine auxiliary prerequisites of databases by investigating customer tasks, applications, and programming; audit targets with customers and assess current frameworks.
- Develop database arrangements by planning proposed framework; characterize physical database structure and utilitarian abilities, security, back-up and recuperation particulars.
- Install database frameworks by creating flowcharts; apply ideal access methods, arrange establishment activities, and record activities.
- Maintain database execution by distinguishing and settling generation and application advancement issues, figuring ideal qualities for parameters; assessing, incorporating, and putting in new discharges, finishing support and responding to client questions.
- Provide database support by coding utilities, reacting to client questions, and settling issues.

Artificial Intelligence Developer

The certain promotion around Artificial Intelligence is additionally set to quicken the number of jobs publicized for masters who truly see how to apply AI, Machine Learning, and Deep Learning strategies in the business world. Selection representatives will request designers with broad learning of a wide exhibit of programming dialects which loan well to AI improvement, for example, Lisp, Prolog, C/C++, Java, and Python.

All said and done; many people estimate that this popular demand for AI specialists could cause a something like what we call a “Brain Drain” organizations poaching talented individuals away from the universe of the scholarly world. A month ago in the Financial Times, profound learning pioneer and specialist Yoshua Bengio, of the University of Montreal expressed: “The industry has been selecting a ton of ability — so now there’s a lack in the scholarly world, which is fine for those organizations. However, it’s not extraordinary for the scholarly world.” It ; howeverusiasm to perceive how this contention among the scholarly world and business is rotated in the following couple of years.

Data Scientist

The move of Big Data from tech publicity to business reality may have quickened, yet the move away from enrolling top Data Scientists isn’t set to change in 2020. An ongoing Deloitte report featured that the universe of business will require three million Data Scientists by 2021, so if their expectations are right, there’s a major ability hole in the market. This multidisciplinary profile requires specialized logical aptitudes, specialized software engineering abilities just as solid gentler abilities, for example, correspondence, business keenness, and scholarly interest.

Data Engineer

Clean and quality data is crucial in the accomplishment of Big Data ventures. Consequently, we hope to see a lot of opening in 2020 for Data Engineers who have a predictable and awesome way to deal with information transformation and treatment. Organizations will search for these special data masters to have broad involvement in controlling data with SQL, T-SQL, R, Hadoop, Hive, Python and Spark. Much like Data Scientists. They are likewise expected to be innovative with regards to contrasting information with clashing information types with have the option to determine issues. They additionally frequently need to make arrangements which enable organizations to catch existing information in increasingly usable information groups – just as performing information demonstrations and their modeling.

IT/Operations Manager Job Description

In Big data industry, the IT/Operations Manager is a profitable expansion to your group and will essentially be in charge of sending, overseeing, and checking your enormous information frameworks. You’ll depend on this colleague to plan and execute new hardware and administrations. The person in question will work with business partners to comprehend the best innovation ventures to address their procedures and concerns—interpreting business necessities to innovation plans. They’ll likewise work with venture chiefs to actualize innovation and be in charge of effective progress and general activities.

Here are some responsibilities that they play:

Manage and be proactive in announcing, settling and raising issues where required
Lead and co-ordinate issue the executive’s exercises, notwithstanding ceaseless procedure improvement activities
Proactively deal with our IT framework
Supervise and oversee IT staffing, including enrollment, supervision, planning, advancement, and assessment
Verify existing business apparatuses and procedures remain ideally practical and worth included
Benchmark, dissect, report on and make suggestions for the improvement and development of the IT framework and IT frameworks
Advance and keep up a corporate SLA structure

Conclusion

These are some of the best career paths that big data professionals can play after entering the industry. Honesty and hard work can always take you to the zenith of any field that you choose to be in. Also, keep upgrading your skills by taking newer certifications and technologies. Good Luck

6 Important Reasons for the Java Experts to learn Hadoop Skills

September 24, 2019/0 Comments/in Big Data, Data Engineering, Insights, Main Category /by Digvijay Upadhyay

You must be well aware of the fact that Java and Hadoop Skills are in high demand these days. Gone are the days when advancement work moved around Java and social database. Today organizations are managing big information. It is genuinely big. From gigabytes to petabytes in size and social databases are exceptionally restricted to store it. Additionally, organizations are progressively outsourcing the Java development jobs to different groups who are as of now having big data experts.

Ever wondered what your future would have in store for you if you possess Hadoop as well as Java skills? No? Let us take a look. Today we shall discuss the point that why is it preferable for Java Developers to learn Hadoop.

Hadoop is the Future Java-based Framework that Leads the Industry

Data analysis is the current marketing strategy that the companies are adopting these days. What’s more, Hadoop is to process and comprehend all the Big Data that is generated all the time. As a rule, Hadoop is broadly utilized by practically all organizations from big and small and in practically all business spaces. It is an open-source stage where Java owes a noteworthy segment of its success

The processing channel of Hadoop, which is MapReduce, is written in Java. Thus, a Hadoop engineer needs to compose MapReduce contents in Java for Big data analysis. Notwithstanding that, HDFS, which is the record arrangement of Hadoop, is additionally Java-based programming language at its core. Along these lines, a Hadoop developer needs to compose documents from local framework to HDFS through deployment, which likewise includes Java programming.

Learn Hadoop: It is More Comfortable for a Java Developer

Hadoop is more of an environment than a standalone innovation. Also, Hadoop is a Java-based innovation. Regardless of whether it is Hadoop 1 which was about HDFS and MapReduce or Hadoop2 biological system that spreads HDFS, Spark, Yarn, MapReduce, Tez, Flink, Giraph, Storm, JVM is the base for all. Indeed, even a portion of the broadly utilized programming languages utilized in a portion of the Hadoop biological system segments like Spark is JVM based. The run of the mill models is Scala and Clojure.

Consequently, if you have a Java foundation, understanding Hadoop is progressively easier for you. Also, here, a Hadoop engineer needs Java programming information to work in MapReduce or Spark structure. Thus, if you are as of now a Java designer with a logical twist of the brain, you are one stage ahead to turn into a Hadoop developer.

IT Industry is looking for Professionals with Java and Hadoop Skills

If you pursue the expected set of responsibilities and range of abilities required for a Hadoop designer in places of work, wherever you will watch the reference of Java. As Hadoop needs solid Java foundation, from this time forward associations are searching for Java designers as the best substitution for Hadoop engineers. It is savvy asset usage for organizations as they don’t have to prepare Java for new recruits to learn Hadoop for tasks.

Nonetheless, the accessible market asset for Hadoop is less. Therefore, there is a noteworthy possibility for Java designers in the Hadoop occupation field. Henceforth, as a Java designer, on the off chance that you are not yet arrived up in your fantasy organization, learning Hadoop, will without a doubt help you to discover the chance to one of your top picks.

Combined Java and Hadoop Skills Means Better Pay Packages

You will be progressively keen on learning Hadoop on the off chance that you investigate Gartner report on big information industry. According to the report, the Big Data industry has just come to the $50 billion points. Additionally, over 64% of the main 720 organizations worldwide are prepared to put resources into big information innovation. Notwithstanding that when you are a mix of a Java and Hadoop engineer, you can appreciate 250% pay climb with a normal yearly compensation of $150,000.It is about the yearly pay of a senior Hadoop developer.

Besides, when you change to Big Data Hadoop, it very well may be useful to improve the nature of work. You will manage unpredictable and greater tasks. It does not just give you a better extension to demonstrate your expertise yet, in addition, to set up yourself as a profitable asset who can have any kind of effect.

Adapting Big Data Hadoop can be exceptionally advantageous because it will assist you in dealing with greater, complex activities a lot simpler and convey preferable yield over your associates. To be considered for examinations, you should be somebody who can have any kind of effect in the group, and that is the thing that Hadoop lets you be.

Learning Hadoop will open New Opportunities to Other Lucrative Fields

Big data is only not going to learn Hadoop. When you are in Big information space, you have sufficient chance to jump other Java and Hadoop engineer. There are different exceedingly requesting zones in big information like Artificial Intelligence, Machine Learning, Data Science. You can utilize your Java and Hadoop engineer expertise as a springboard to take your vocation to the following level. In any case, the move will give you the best outcome once you move from Java to Hadoop and increase fundamental working knowledge.

Java with Hadoop opens new skylines of occupation jobs, for example, data scientist, data analyst business intelligence analyst, DBA, etc.

Premier organizations prefer Hadoop Developers with Java skills

Throughout the years the Internet has been the greatest driver of information, and the new data produced in 2012 remained at 2500 Exabyte. The computerized world developed by 62% a year ago to 800K petabytes and will keep on developing to the tune of 1.2 zeta bytes during the present year. Gartner gauges the market of Hadoop Ecosystem to $77 million and predicts it will come to the $813 million marks by 2016.

A review of LinkedIn profiles referencing Hadoop as their abilities uncovered that just about 17000 individuals are working in Companies like Cisco, HP, TCS, Oracle, Amazon, Yahoo, and Facebook, and so on. Aside from this Java proficient who learn Hadoop can begin their vocations with numerous new businesses like Platfora, Alpine information labs, Trifacta, Datatorrent, and so forth.

Conclusion

You can see that combining your Java skills with Hadoop skills can open the doors of several new opportunities for you. You can get better remuneration for your efforts, and you will always be in high demand. It is high time to learn Hadoop online now if you are a java developer.

The Future of AI in Dental Technology

August 21, 2019/0 Comments/in Artificial Intelligence, Big Data, Interviews /by Haroon Bhutta

As we develop more advanced technology, we begin to learn that artificial intelligence can have more and more of an impact on our lives and industries that we have gotten used to being the same over the past decades. One of those industries is dentistry. In your lifetime, you’ve probably not seen many changes in technology, but a boom around artificial intelligence and technology has opened the door for AI in dental technologies.

How Can AI Help?

Though dentists take a lot of pride in their craft and career, most acknowledge that AI can do some things that they can’t do or would make their job easier if they didn’t have to do. AI can perform a number of both simple and advanced tasks. Let’s take a look at some areas that many in the dental industry feel that AI can be of assistance.

Repetitive, Menial Tasks

The most obvious area that AI can help out when it comes to dentistry is with repetitive and menial simple tasks. There are many administrative tasks in the dentistry industry that can be sped up and made more cost-effective with the use of AI. If we can train a computer to do some of these tasks, we may be able to free up more time for our dentists to focus on more important matters and improve their job performance as well. One primary use of AI is virtual consultations that offices like Philly Braces are offering. This saves patients time when they come in as the Doctor already knows what the next steps in their treatment will be.

Using AI to do some basic computer tasks is already being done on a small scale by some, but we have yet to see a very large scale implementation of this technology. We would expect that to happen soon, with how promising and cost-effective the technology has proven to be.

Reducing Misdiagnosis

One area that many think that AI can help a lot in is misdiagnosis. Though dentists do their best, there is still a nearly 20% misdiagnosis rate when reading x-rays in dentistry. We like to think that a human can read an x-ray better, but this may not be the case. AI technology can certainly be trained to read an x-ray and there have been some trials to suggest that they can do it better and identify key conditions that we often misread.

A world with AI diagnosis that is accurate and quicker will save time, money, and lead to better dental health among patients. It hasn’t yet come to fruition, but this seems to be the next major step for AI in dentistry.

Artificial Intelligence Assistants

Once it has been demonstrated that AI can perform a range of tasks that are useful to dentists, the next logical step is to combine those skills to make a fully-functional AI dental assistant. A machine like this has not yet been developed, but we can imagine that it would be an interface that could be spoken to similar to Alexa. The dentist would request vital information and other health history data from a patient or set of patients to assist in the treatment process. This would undoubtedly be a huge step forward and bring a lot of computing power into the average dentist office.

Conclusion

It’s clear that AI has a bright future in the dental industry and has already shown some of the essential skills that it can help with in order to provide more comprehensive and accurate care to dental patients. Some offices like Westwood Orthodontics already use AI in the form of a virtual consult to diagnose issues and provide treatment options before patients actually step foot in the office. Though not nearly all applications that AI can provide have been explored, we are well on our way to discovering the vast benefits of artificial intelligence for both patients and practices in the dental healthcare industry.

Lisa Gao, DDS, MS | Westwood Orthodontics 1033 Gayley Ave #106, Los Angeles California 90024, 310-870-1823

The New Age of Big Data: Is It the Death of Hadoop?

July 22, 2019/1 Comment/in Big Data, Main Category /by Edward Huskin

Big Data had gone through several transformations through the years, growing into the phrase we identify it as today. From its first identified use on the back of Hadoop and MapReduce, a new age of Big Data has been ushered in with the spread of new technologies such as Kubernetes, Spark, and NoSQL databases.

These might not serve the exact same purpose as Hadoop individually, but they fill the same niche and do the same job with features the original platform designers never envisioned.

The multi-cloud architecture boom and increasing emphasis on real-time data may just mean the end of Big Data as we know it, and Hadoop with it.

A brief history of Big Data

The use of data for making business decisions can be traced back to ancient civilizations in Mesopotamia. However, the age of Big Data as we know it is only as old as 2005 when O’Reilly Media launched the phrase. It was used to describe the massive amounts of data that the world was beginning to produce on the internet.

The newly-dubbed Web 2.0 needed to be indexed and easily searchable, and, Yahoo, being the behemoth that it was, was just the right company for the job. Hadoop was born off the efforts of Yahoo engineers, depending on Google’s MapReduce under the hood. A new era of Big Data had begun, and Hadoop was at the forefront of the revolution.

The new technologies led to a fundamental shift in the way the world regarded data processing. Traditional assumptions of atomicity, consistency, isolation, and durability (ACID) began to fade, and new use cases for previously unusable data began to emerge.

Hadoop would begin its life as a commercial platform with the launch of Cloudera in 2008, followed by rivals such as Hortonworks, EMC and MapR. It continued its momentous run until it seemingly hit its peak in 2015, and its place in the enterprise market would never be guaranteed again

Where Hadoop Couldn’t Keep Up

Hadoop made its mark in the world of Big Data by being a platform to collect, store and analyze large swathes of data. However, not even a technology as revolutionary and versatile as Hadoop could exist without its drawbacks.

Some of these would be so costly developers would rather design whole new systems to deal with them. With time, Hadoop started to lose its charm, unable to grow past its initial vision as a Big Data software.

Hadoop is a machine made up of smaller moving parts that are incredibly efficient at what they do – crunch data. This ultimately results in one of the first drawbacks of Hadoop – it does not come with built-in support for analytics data. Hadoop works well to process your data, but not likely as you need – visual reports about how the data is being processed, for instance.

MapReduce was also built from the ground up to be file-intensive. This makes it a great piece of software for simple requests, but not so much for iterative data. For smaller datasets, it turns out to be a rather inefficient solution.

Another area Hadoop lands flat on its face is with regards to real-time processing and reporting. Hadoop suffers from the curse of time. It relies on technologies that even its very founders (Google in particular) no longer rely on.

With MapReduce, every time you want to analyze a modified dataset (say, after adding or deleting data), you have to stream over the whole dataset again. Thanks to this feature, Hadoop is horrible at real-time reporting – a feature that led to the creation of Percolator, MapReduce’s replacement within Google.

The emergence of better technology has also meant a rise in the number of threats to said technology and a corresponding increase in the emphasis that is placed on it.

Unfortunately, Hadoop is nowhere close to being secure. As a matter of fact, its security settings are off by default, and it has too much inertia to simply change that. To make things worse, plugging in security measures isn’t that much easier.

The Fall of Hadoop

With these and more shortcomings in the data science world, new tools such as Hive, Pig and Spark were created to work on top of Hadoop to overcome its weaknesses. But it simply couldn’t grow out of the shoes it had been made for.

The growth of NoSQL databases such as Hazelcast and MongoDB also meant that problems Hadoop was designed to support were now being solved by single players rather than the ‘all or nothing’ approach Hadoop was designed with. It wasn’t flexible enough to evolve beyond simply being a batch processing software.

Over time, new Big Data challenges began to emerge that a large monolithic software like Hadoop couldn’t deal with, either. Being primarily file-intensive, it couldn’t keep up with the variety of data sources that were now available, the lack of support for dynamic schemas, on-the-fly queries, and the rise of cloud infrastructure all caused people to seek different solutions. Hadoop had lost its grip on the enterprise world.

Businesses whose primary concern was dealing with Hadoop infrastructure like Cloudera and Hortonworks were seeing less and less adoption. This led to the eventual merger of the two companies in 2019, and the same message rang out from different corners of the world at the same time: ‘Hadoop is dead.’

Is Hadoop Really Dead?

Hadoop still has a place in the enterprise world – the problems it was designed to solve still exist to this day. Technologies such as Spark have largely taken over the same space that Hadoop once occupied.

The question of Hadoop or Spark is one every data scientist has to contend with at some point, and most seem to be settling in the latter of these, thanks to the great advantages is speed it offers.

It’s unlikely Hadoop will see much more adoption with newer marker entrants, especially considering the pace with which technology moves. It also doesn’t help that a lot of alternatives have a much smaller learning curve than the convoluted monolith that is Hadoop. Companies like MapR and Cloudera have also begun to pivot away from Hadoop-only infrastructure to more robust cloud-based solutions. Hadoop still has its place, but maybe not for long.

Big Data has reduced the boundary between demand-centric dynamic pricing and user-behavior centric pricing!

December 2, 2018/0 Comments/in Big Data, Business Analytics, Data Science /by James Warner

Real-time pricing is also known as Dynamic pricing, and it is a method to plan and set highly flexible prices of the services or the products. Dynamic pricing is aimed to help the online organizations modify the costs on the fly in relation to the ever changing market conditions. All sorts of modifications are managed the costing bots, who collect the information, and use the algorithms in order to regulate the costing, keeping in mind the set guidelines. With the help of data analysis, vendors can accurately forecast the best prices, and also can adjust it as per the changing needs.

What’s the role of Big Data in Dynamics pricing?

Big data strategies are made just to get the required insights which help to enhance the performance of a business. Still, companies find it difficult to understand the capabilities of analytics, and how the analytics can be used to make the process of pricing all the more powerful. Various levels of Big Data collection, and analysis result into planning a proper dynamics pricing structure. The Big Data captured by the companies hold a lot of value when it comes to devising solid, and very workable dynamics costing structures.

Each and every one of the data-oriented firms move from the basic data reporting stage via a plenty of stages to get to the utmost, desirable level of optimization that’s deemed the most sophisticated. This eventually helps to enhance the revenue management process as well.

How Big Data lessens the gap between demand-centric dynamic pricing and user-behavior centric pricing?

Big Data as we have discussed above has a major role to play when it comes to setting dynamic pricing plans. Dynamic pricing is now further categorized into different segments and two of them are demand-centric dynamic pricing and user-behavior centric pricing. Both of these hold equal importance in creating a top pricing strategy. However, one of the other important things is that, it acts as a liaison between the two as well. It bridges the gap between the two. When it comes to demand centric costing, it is referred to as what the customer needs, and what the customer is looking for. Whereas, when it comes to user behavior pricing, it is more related to what we should be offering to the customer as per the interest levels of the customers.

Now, both of these parameters hold equal importance when it comes to making costing strategies that are fruitful. To set proper ‘demand centric pricing’ it is importance to know about the demand as well as the wants of the target audience. And, when it comes to user-behavior centric pricing, we need to know how the user is feeling, and what interest areas are. This where the role of Big Data analytics come into play.

Big Data analytics of relative information helps to find out both, the demands and well as the user behaviors. Big Data analytics done to study the target audience are a best way to get to the answers. Once we know about the demands and the user behavior we have to combine both of these to churn our better pricing strategies.

The costing plans should be taken into consideration by mapping both of these elements together. For example, even whenever we curate marketing strategies, they are basically catering to the demands of the public. But, at the same time, user-behavior is never neglected either. It’s a mix of both that we need for setting dynamic prices as well. The modifications which should be done in the pricing should be done based on collective insights gained by clubbing both the elements together.

By studying both the demands graphs as well as the user behavior reports, a company can devise plans that will turn out to be very useful when it comes to costing. Dynamic pricing is as it is a very fruitful invention, and the integration of Big Data has made it all the more powerful.

Big Data is one of those technologies which has made a lot possible in a lot of areas. Be it the pricing structures or the business strategies, Big Data analytics are used everywhere to improve the performance of the company.

The Inside Out of ML Based Prescriptive Analytics

November 17, 2018/0 Comments/in Big Data, Business Analytics, Business Intelligence, Data Mining, Data Science, Main Category, Predictive Analytics /by James Warner

With the constantly growing number of data, more and more companies are shifting towards analytic solutions. Analytic solutions help in extracting the meaning from the huge amount of data available. Thus, improving decision making.

Decision making is an important aspect of businesses, and technologies like Machine Learning are enhancing it further. The growing use of Machine Learning has changed the way of prescriptive analytics. In order to optimize the efforts, companies need to be more accurate with the historical and present data. This is because the historical and present data are the essentials of analytics. This article helps describe the inside out of Machine Learning-based prescriptive analytics.

Phases of business analytics

Descriptive analytics, predictive analytics, and prescriptive analytics are the three phases of business analytics. Descriptive analytics, being the first one, deals with past performance. Historical data is mined to understand past performance. This serves as a way to look for the reasons behind past success and failure. It is a kind of post-mortem analysis and most management reporting like sales, marketing, operations, and finance etc. make use of this.

The second one is a predictive analysis which answers the question of what is likely to happen. The historical data is now combined with rules, algorithms etc. to determine the possible future outcome or likelihood of a situation occurring.

The final phase, well known to everyone, is prescriptive analytics. It can continually take in new data and re-predict and re-prescribe. This improves the accuracy of the prediction and prescribes better decision options. Professional services or technology or their combination can be chosen to perform all the three analytics.

More about prescriptive analytics

The analysis of business activities goes through many phases. Prescriptive analytics is one such. It is known to be the third phase of business analytics and comes after descriptive and predictive analytics. It entails the application of mathematical and computational sciences. It makes use of the results obtained from descriptive and predictive analysis to suggest decision options. It goes beyond predicting future outcomes and suggests actions to benefit from the predictions. It shows the implications of each decision option. It anticipates on what will happen when it will happen as well as why it will happen.

ML-based prescriptive analytics

Being just before the prescriptive analytics, predictive analytics is often confused with it. What actually happens is predictive analysis leads to prescriptive analysis. Thus, a Machine Learning based prescriptive analytics goes through an ML-based predictive analysis first. Therefore, it becomes necessary to consider the ML-based predictive analysis first.

ML-based predictive analytics:

A lot of things prevent businesses from achieving predictive analysis capabilities. Machine Learning can be a great help in boosting Predictive analytics. Use of Machine Learning and Artificial Intelligence algorithms helps businesses in optimizing and uncovering the new statistical patterns. These statistical patterns form the backbone of predictive analysis. E-commerce, marketing, customer service, medical diagnosis etc. are some of the prospective use cases for Machine Learning based predictive analytics.

In E-commerce, machine learning can help in predicting the usual choices of the customer. Thus, presenting him/her according to his/her likes and dislikes. It can also help in predicting fraudulent transaction. Similarly, B2B marketing also makes good use of Machine learning based predictive analytics. Customer services and medical diagnosis also benefit from predictive analytics. Thus, a prediction and a prescription based on machine learning can boost various business functions.

Organizations and software development companies are making more and more use of machine learning based predictive analytics. The advancements like neural networks and deep learning algorithms are able to uncover hidden information. This all requires a well-researched approach. Big data and progressive IT systems also act as important factors in this.

Deep Learning and Human Intelligence – Part 1 of 2

July 5, 2018/0 Comments/in Artificial Intelligence, Data Science, Deep Learning, Gerneral, Machine Learning, Predictive Analytics /by Valentin Curtef

Many people are under the impression that the new wave of data science, machine learning and/or digitalization is new, that it did not exist before. But its history is as long as the history of humanity and/or science itself. The scientific discovery could hardly take place without the necessary data. Even the process of discovering the numbers included elements of machine learning: pattern recognition, comparison between different groups (ranking), clustering, etc. So what differentiates mathematical formulas from machine learning and how does it relate to artificial intelligence?

There is no difference between the two if seen from the perspective of formulas however, such a perspective limits the type of data to which they can be applied. Data stored via tables consist of structured data and are stored in so-called relational databases. The reason for such a data storage is the connection between different fields that assume a well-established structure in advance, such as a company’s sales or balance sheet. However, with the emergence of personal computers, many of the daily activities have been digitalized: music, pictures, movies, and so on. All this information is stored unrelated to other data and therefore called unstructured data.

IEEE International Conference on Computer Vision (ICCV), 2015, DOI: 10.1109/ICCV.2015.428

The essence of scientific discoveries was and will be structure. Not surprisingly, the mathematical formulas revolve around relations between variables – information, in general. For example, Galileo derived the law of falling balls from measuring the successive hight of a falling ball. The main difficulty was to obtain measurements at regular time intervals. What about if the data is not structured, which mathematical formula should be applied then? There is a distribution of people’s height, but no distribution for the pictures taken in all holidays for the last year, there is an amplitude for acoustic signals, but no function that detects the similarity between two songs. This is one of the reasons why machine learning focuses heavily on clustering and classification.

Roughly speaking, these simple examples are enough to categorize the difference between scientific discovery and machine learning. Science is about discovering relationships between different variables, Machine Learning tries to automatize processes. Every technical improvement is part of the automation, so why is everything different in this case? Because the current automation deals with human intelligence. The car automates the walking, the kitchen stove the fire, but Machine Learning parts of the human intelligence. There is a difference between the previous automation steps and those of human intelligence. All the previous ones are either outside the human body – such as Fire – or unconsciously executed (once learned) – walking, spinning, etc. The automation induced by Machine Learning affects a part of the human intelligence that we consciously perceive. Of course, today’s machine learning tools are unable to automate all human intelligence, but it is a fascinating step in that direction.

A breakthrough in Machine Learning tasks was achieved in 2012 when the first Deep Learning algorithm for detecting types of images, reached near-human accuracy. It could appreciate the likelihood that the image is a human face, a train, a ball or a fish without having “seen” the picture before. Such an algorithm can be used in various areas: personally – facial recognition in pictures and/or social media – as tagging of images or videos, medicine – cancer detection, etc. For understanding such cutting-edge issues of classification, one cannot avoid understanding how Deep Learning works. To see the beauty of such algorithms and, at the same time, to be able to comprehend the difficulty of working with them, an example will be the best guide.

The building blocks of Deep Learning are neurons, operational units, which perform mathematical operations or logical operations like AND, OR, etc., and are modelled after the neurons in the brain. Already in the 1950’s two neuroscientist, Hubel and Wiesel, observed that not all neurons in the brain are responding in the same fashion to visual stimuli. Some responded only to horizontal lines, whereas others to vertical lines, with other words, the brain is constructed with specialized neurons. Groups of such neurons are called, in the Machine Learning community, layers. Like in the brain, neurons with different properties are clustered in different layers. This implies that layers have also specific properties and have to be arranged in a specific way, called architecture. It is this architecture which differentiates Deep Learning from Artificial Neuronal Networks (ANN are similar to a layer).

Unfortunately, scientists still haven’t figured out how the brain works, thus to discover how to train Deep Learning from data was not an easy task, and is also the reason why another example is used to explain the training of Deep Learning: the eye. One has always to remember: once it is known how Deep Learning works, it is simple to find example which illustrates the working mechanism. For such an analogy, it is sufficient for someone without any knowledge about Deep Learning, to keep in mind only the elements that compose such architectures: input data, different layers of neurons, output layers, ReLu’s.

Input data are any type of information, in our example it is light. Of course, that Deep Learning is not limited only to images or videos, but also to sound and/or time series, which would imply that the example would be the ear and sound waves, or the brain and numbers.

Layers can be seen as cells in the eye. It is well known that the eye is formed of different layers connected to each other with each of them having different properties, functionalities. The same is true also for the layers of a Deep Learning architecture: one can see the neurons as cells of the layer as the tissue. While, mathematically, the neurons are nothing more than simple operations, usually linear weight functions, they can be seen as the properties of individual cells. Each layer has one weight matrix, which gives the neuron (and layer) specific properties depending on the data and the task at hand.

It is here that the architecture becomes very important. What Deep Learning offers is a default setting of the layers with unknown weights. One can see this as trying to build an eye knowing that there are different types of cells and different ways how tissues of such cells can be arranged, but not which cell exactly is needed (with what properties) and which arrangement of layers works best. Such an approach has the advantage that one is capable of building any type of organ desired, but the disadvantage is also very obvious: it is time consuming to find the appropriate cell properties and layers arrangements.

Still, the strategy of Deep Learning is a significant departure from the Machine Learning approaches. The performance of Machine Learning methods is as good as the features engineering performed by Data Scientists, and thus depending on the creativity of the Data Scientist. In the case of Deep Learning the engineers of the features is performed automatically as part of the model building. This is a huge improvement, as the only difficult task is to have enough data and computer power to find the right weights matrices. Such an endeavor was performed also by nature for the eye — and is also the reason why one can choose it as an example for Deep Learning — evolution. It is not surprising that Deep Learning is one of the best direction scientists have of Artificial Intelligence today.

The evolution of the eye can be seen, from the perspective of Data Scientists, as the continuous training of a Deep Learning architecture which enables to recognize and track one or more objects. The performance of the evolutional process can be summed up as the fine tuning of the cells which are getting more and more susceptible to light and the adaptation of layers to enable a better vision. Different animals in different environments and different targets — as the hawk and the fly — developed different eyes than humans, but they all work according to the same principle. The tasks that Deep Learning is performing today are similar, for example it can be used to drive cars but there is still a difference: there is no connection to other organs. Deep Learning is not the approximation of an Artificial Organism, like an android, but a simplified Artificial Organ that can work on its own.

Returning to the working mechanism of the Deep Learning architecture, we can already follow the analogy of what happens if a ray of light is hitting the eye. Once the eye is fully adapted to the task, one can followed how the information enters the Deep Learning architecture (Artificial Eye) by penetrating the input layer. already here arises the question, what kind of eye is the best? One where a small source of light can reach as many neurons as possible, or the one where the light sources reaches only few neurons? In order to take such a decision, a last piece of the puzzle is required: ReLu. One can see them as synapses between neurons (cells) and/or similarly for tissue. By using continuous functions, such as the shape of the latter ‘S’ (called sigmoid), the information from one neuron will be distributed over a large number of other neurons. If one uses the maximum function, then only few neurons are updated with processed information from earlier layers.

Such sparse structures between neurons, was a major improvement in the development of the technique of training Deep Learning architectures. Again, it has a strong evolutionary analogy: energy efficiency. By needing less neurons, the tissues and architecture are both kept to a minimal size which enables flexibility in development and less energy. As the information is process by the different layers, the Artificial Eye is gathering more and more complex (non-linear) structures — the adapted features –, which help to decide, from past experience, what kind of object is detected.

This was part 1 of 2 of the article series. Continue with Part 2.

Artificial Intelligence and Data Science in the Automotive Industry

May 6, 2017/1 Comment/in Artificial Intelligence, Big Data, Business Analytics, Business Intelligence, Cloud, Connected Car, Data Science, Machine Learning, Main Category, Manufacturing, Projectmanagement, Use Case, Use Cases /by Volkswagen

Data science and machine learning are the key technologies when it comes to the processes and products with automatic learning and optimization to be used in the automotive industry of the future. This article defines the terms “data science” (also referred to as “data analytics”) and “machine learning” and how they are related. In addition, it defines the term “optimizing analytics“ and illustrates the role of automatic optimization as a key technology in combination with data analytics. It also uses examples to explain the way that these technologies are currently being used in the automotive industry on the basis of the major subprocesses in the automotive value chain (development, procurement; logistics, production, marketing, sales and after-sales, connected customer). Since the industry is just starting to explore the broad range of potential uses for these technologies, visionary application examples are used to illustrate the revolutionary possibilities that they offer. Finally, the article demonstrates how these technologies can make the automotive industry more efficient and enhance its customer focus throughout all its operations and activities, extending from the product and its development process to the customers and their connection to the product.

Read this article in German:
“Künstliche Intelligenz und Data Science in der Automobilindustrie“

1 Introduction

2 The Data Mining Process

3 The pillars of artificial intelligence
3.1 Maschine Learning
3.2 Computer Vision
3.3 Inference and decision-making
3.4 Language and communication
3.5 Agents and Actions

4 Data mining and artificial intelligence in the automotive industry
4.1 Development
4.2 Procurement
4.3 Logistics
4.4 Production
4.5 Marketing
4.6 Sales, After Sales, and Retail
4.7 Connected Customer

5 Vision
5.1 Autonomous vehicles
5.2 Integrated factory optimization
5.3 Companies acting autonomously

6 Conclusions

7 Authors

8 Sources

Authors:

Dr. Martin Hofmann (CIO – Volkswagen AG)
Dr. Florian Neukart (Principal Data Scientist – Volkswagen AG)
Prof. Dr. Thomas Bäck (Universität Leiden)

1 Introduction

Data science and machine learning are now key technologies in our everyday lives, as we can see in a multitude of applications, such as voice recognition in vehicles and on cell phones, automatic facial and traffic sign recognition, as well as chess and, more recently, Go machine algorithms[1] which humans can no longer beat. The analysis of large data volumes based on search, pattern recognition, and learning algorithms provides insights into the behavior of processes, systems, nature, and ultimately people, opening the door to a world of fundamentally new possibilities. In fact, the now already implementable idea of autonomous driving is virtually a tangible reality for many drivers today with the help of lane keeping assistance and adaptive cruise control systems in the vehicle.

The fact that this is just the tip of the iceberg, even in the automotive industry, becomes readily apparent when one considers that, at the end of 2015, Toyota and Tesla’s founder, Elon Musk, each announced investments amounting to one billion US dollars in artificial intelligence research and development almost at the same time. The trend towards connected, autonomous, and artificially intelligent systems that continuously learn from data and are able to make optimal decisions is advancing in ways that are simply revolutionary, not to mention fundamentally important to many industries. This includes the automotive industry, one of the key industries in Germany, in which international competitiveness will be influenced by a new factor in the near future – namely the new technical and service offerings that can be provided with the help of data science and machine learning.

This article provides an overview of the corresponding methods and some current application examples in the automotive industry. It also outlines the potential applications to be expected in this industry very soon. Accordingly, sections 2 and 3 begin by addressing the subdomains of data mining (also referred to as “big data analytics”) and artificial intelligence, briefly summarizing the corresponding processes, methods, and areas of application and presenting them in context. Section 4 then provides an overview of current application examples in the automotive industry based on the stages in the industry’s value chain –from development to production and logistics through to the end customer. Based on such an example, section 5 describes the vision for future applications using three examples: one in which vehicles play the role of autonomous agents that interact with each other in cities, one that covers integrated production optimization, and one that describes companies themselves as autonomous agents.

Whether these visions will become a reality in this or any other way cannot be said with certainty at present – however, we can safely predict that the rapid rate of development in this area will lead to the creation of completely new products, processes, and services, many of which we can only imagine today. This is one of the conclusions drawn in section 6, together with an outlook regarding the potential future effects of the rapid rate of development in this area.

2 The data mining process

Gartner uses the term “prescriptive analytics“ to describe the highest level of ability to make business decisions on the basis of data-based analyses. This is illustrated by the question “what should I do?” and prescriptive analytics supplies the required decision-making support, if a person is still involved, or automation if this is no longer the case.

The levels below this, in ascending order in terms of the use and usefulness of AI and data science, are defined as follows: descriptive analytics (“what has happened?”), diagnostic analytics (“why did it happen?”), and predictive analytics (“what will happen?”) (see Figure 1). The last two levels are based on data science technologies, including data mining and statistics, while descriptive analytics essentially uses traditional business intelligence concepts (data warehouse, OLAP).

In this article, we seek to replace the term “prescriptive analytics“ with the term “optimizing analytics.“ The reason for this is that a technology can “prescribe” many things, while, in terms of implementation within a company, the goal is always to make something “better” with regard to target criteria or quality criteria. This optimization can be supported by search algorithms, such as evolutionary algorithms in nonlinear cases and operation research (OR) methods in – much rarer – linear cases. It can also be supported by application experts who take the results from the data mining process and use them to draw conclusions regarding process improvement. One good example are the decision trees learned from data, which application experts can understand, reconcile with their own expert knowledge, and then implement in an appropriate manner. Here too, the application is used for optimizing purposes, admittedly with an intermediate human step.

Within this context, another important aspect is the fact that multiple criteria required for the relevant application often need to be optimized at the same time, meaning that multi-criteria optimization methods – or, more generally, multi-criteria decision-making support methods – are necessary. These methods can then be used in order to find the best possible compromises between conflicting goals. The examples mentioned include the frequently occurring conflicts between cost and quality, risk and profit, and, in a more technical example, between the weight and passive occupant safety of a body.

Figure 1: The four levels of data analysis usage within a company

These four levels form a framework, within which it is possible to categorize data analysis competence and potential benefits for a company in general. This framework is depicted in Figure 1 and shows the four layers which build upon each other, together with the respective technology category required for implementation.

The traditional Cross-Industry Standard Process for Data Mining (CRISP-DM)[2] includes no optimization or decision-making support whatsoever. Instead, based on the business understanding, data understanding, data preparation, modeling, and evaluation sub-steps, CRISP proceeds directly to the deployment of results in business processes. Here too, we propose an additional optimization step that in turn comprises multi-criteria optimization and decision-making support. This approach is depicted schematically in Figure 2.

Figure 2: Traditional CRISP-DM process with an additional optimization step

It is important to note that the original CRISP model deals with a largely iterative approach used by data scientists to analyze data manually, which is reflected in the iterations between business understanding and data understanding as well as data preparation and modeling. However, evaluating the modeling results with the relevant application experts in the evaluation step can also result in having to start the process all over again from the business understanding sub-step, making it necessary to go through all the sub-steps again partially or completely (e.g., if additional data needs to be incorporated).

The manual, iterative procedure is also due to the fact that the basic idea behind this approach – as up-to-date as it may be for the majority of applications – is now almost 20 years old and certainly only partially compatible with a big data strategy. The fact is that, in addition to the use of nonlinear modeling methods (in contrast to the usual generalized linear models derived from statistical modeling) and knowledge extraction from data, data mining rests on the fundamental idea that models can be derived from data with the help of algorithms and that this modeling process can run automatically for the most part – because the algorithm “does the work.”

In applications where a large number of models need to be created, for example for use in making forecasts (e.g., sales forecasts for individual vehicle models and markets based on historical data), automatic modeling plays an important role. The same applies to the use of online data mining, in which, for example, forecast models (e.g., for forecasting product quality) are not only constantly used for a production process, but also adapted (i.e., retrained) continuously whenever individual process aspects change (e.g., when a new raw material batch is used). This type of application requires the technical ability to automatically generate data, and integrate and process it in such a way that data mining algorithms can be applied to it. In addition, automatic modeling and automatic optimization are necessary in order to update models and use them as a basis for generating optimal proposed actions in online applications. These actions can then be communicated to the process expert as a suggestion or – especially in the case of continuous production processes – be used directly to control the respective process. If sensor systems are also integrated directly into the production process – to collect data in real time – this results in a self-learning cyber-physical system [3] that facilitates implementation of the Industry 4.0[4] vision in the field of production engineering.

Figure 3: Architecture of an Industry 4.0 model for optimizing analytics

This approach is depicted schematically in Figure 3. Data from the system is acquired with the help of sensors and integrated into the data management system. Using this as a basis, forecast models for the system’s relevant outputs (quality, deviation from target value, process variance, etc.) are used continuously in order to forecast the system’s output. Other machine learning options can be used within this context in order, for example, to predict maintenance results (predictive maintenance) or to identify anomalies in the process. The corresponding models are monitored continuously and, if necessary, automatically retrained if any process drift is observed. Finally, the multi-criteria optimization uses the models to continuously compute optimum setpoints for the system control. Human process experts can also be integrated here by using the system as a suggestion generator so that a process expert can evaluate the generated suggestions before they are implemented in the original system.

In order to differentiate it from “traditional” data mining, the term “big data” is frequently defined now with three (sometimes even four or five) essential characteristics: volume, velocity, and variety, which refer to the large volume of data, the speed at which data is generated, and the heterogeneity of the data to be analyzed, which can no longer be categorized into the conventional relational database schema. Veracity, i.e., the fact that large uncertainties may also be hidden in the data (e.g., measurement inaccuracies), and finally value, i.e., the value that the data and its analysis represents for a company’s business processes, are often cited as additional characteristics. So it is not just the pure data volume that distinguishes previous data analytics methods from big data, but also other technical factors that require the use of new methods– such as Hadoop and MapReduce – with appropriately adapted data analysis algorithms in order to allow the data to be saved and processed. In addition, so-called “in-memory databases” now also make it possible to apply traditional learning and modeling algorithms in main memory to large data volumes.

This means that if one were to establish a hierarchy of data analysis and modeling methods and techniques, then, in very simplistic terms, statistics would be a subset of data mining, which in turn would be a subset of big data. Not every application requires the use of data mining or big data technologies. However, a clear trend can be observed, which indicates that the necessities and possibilities involved in the use of data mining and big data are growing at a very rapid pace as increasingly large data volumes are being collected and linked across all processes and departments of a company. Nevertheless, conventional hardware architecture with additional main memory is often more than sufficient for analyzing large data volumes in the gigabyte range.

Although optimizing analytics is of tremendous importance, it is also crucial to always be open to the broad variety of applications when using artificial intelligence and machine learning algorithms. The wide range of learning and search methods, with potential use in applications such as image and language recognition, knowledge learning, control and planning in areas such as production and logistics, among many others, can only be touched upon within the scope of this article.

3 The pillars of artificial intelligence

An early definition of artificial intelligence from the IEEE Neural Networks Council was “the study of how to make computers do things at which, at the moment, people are better.”[5] Although this still applies, current research is also focused on improving the way that software does things at which computers have always been better, such as analyzing large amounts of data. Data is also the basis for developing artificially intelligent software systems not only to collect information, but also to:

Learn
Understand and interpret information
Behave adaptively
Plan
Make inferences
Solve problems
Think abstractly
Understand and interpret ideas and language

3.1 Machine learning

At the most general level, machine learning (ML) algorithms can be subdivided into two categories: supervised and unsupervised, depending on whether or not the respective algorithm requires a target variable to be specified.

Supervised learning algorithms

Apart from the input variables (predictors), supervised learning algorithms also require the known target values (labels) for a problem. In order to train an ML model to identify traffic signs using cameras, images of traffic signs – preferably with a variety of configurations – are required as input variables. In this case, light conditions, angles, soiling, etc. are compiled as noise or blurring in the data; nonetheless, it must be possible to recognize a traffic sign in rainy conditions with the same accuracy as when the sun is shining. The labels, i.e., the correct designations, for such data are normally assigned manually. This correct set of input variables and their correct classification constitute a training data set. Although we only have one image per training data set in this case, we still speak of multiple input variables, since ML algorithms find relevant features in training data and learn how these features and the class assignment for the classification task indicated in the example are associated. Supervised learning is used primarily to predict numerical values (regression) and for classification purposes (predicting the appropriate class), and the corresponding data is not limited to a specific format – ML algorithms are more than capable of processing images, audio files, videos, numerical data, and text. Classification examples include object recognition (traffic signs, objects in front of a vehicle, etc.), face recognition, credit risk assessment, voice recognition, and customer churn, to name but a few.

Regression examples include determining continuous numerical values on the basis of multiple (sometimes hundreds or thousands) input variables, such as a self-driving car calculating its ideal speed on the basis of road and ambient conditions, determining a financial indicator such as gross domestic product based on a changing number of input variables (use of arable land, population education levels, industrial production, etc.), and determining potential market shares with the introduction of new models. Each of these problems is highly complex and cannot be represented by simple, linear relationships in simple equations. Or, to put it another way that more accurately represents the enormous challenge involved: the necessary expertise does not even exist.

Unsupervised learning algorithms

Unsupervised learning algorithms do not focus on individual target variables, but instead have the goal of characterizing a data set in general. Unsupervised ML algorithms are often used to group (cluster) data sets, i.e., to identify relationships between individual data points (that can consist of any number of attributes) and group them into clusters. In certain cases, the output from unsupervised ML algorithms can in turn be used as an input for supervised methods. Examples of unsupervised learning include forming customer groups based on their buying behavior or demographic data, or clustering time series in order to group millions of time series from sensors into groups that were previously not obvious.

In other words, machine learning is the area of artificial intelligence (AI) that enables computers to learn without being programmed explicitly. Machine learning focuses on developing programs that grow and change by themselves as soon as new data is provided. Accordingly, processes that can be represented in a flowchart are not suitable candidates for machine learning – in contrast, everything that requires dynamic and changing solution strategies and cannot be constrained to static rules is potentially suitable for solution with ML. For example, ML is used when:

No relevant human expertise exists
People are unable to express their expertise
The solution changes over time
The solution needs to be adapted to specific cases

In contrast to statistics, which follows the approach of making inferences based on samples, computer science is interested in developing efficient algorithms for solving optimization problems, as well as in developing a representation of the model for evaluating inferences. Methods frequently used for optimization in this context include so-called “evolutionary algorithms” (genetic algorithms, evolution strategies), the basic principles of which emulate natural evolution[6]. These methods are very efficient when applied to complex, nonlinear optimization problems.

Even though ML is used in certain data mining applications, and both look for patterns in data, ML and data mining are not the same thing. Instead of extracting data that people can understand, as is the case with data mining, ML methods are used by programs to improve their own understanding of the data provided. Software that implements ML methods recognizes patterns in data and can dynamically adjust the behavior based on them. If, for example, a self-driving car (or the software that interprets the visual signal from the corresponding camera) has been trained to initiate a braking maneuver if a pedestrian appears in front it, this must work with all pedestrians regardless of whether they are short, tall, fat, thin, clothed, coming from the left, coming from the right, etc. In turn, the vehicle must not brake if there is a stationary garbage bin on the side of the road.

The level of complexity in the real world is often greater than the level of complexity of an ML model, which is why, in most cases, an attempt is made to subdivide problems into subproblems and then apply ML models to these subproblems. The output from these models is then integrated in order to permit complex tasks, such as autonomous vehicle operation, in structured and unstructured environments.

3.2 Computer vision

Computer vision (CV) is a very wide field of research that merges scientific theories from various fields (as is often the case with AI), starting from biology, neuroscience, and psychology and extending all the way to computer science, mathematics, and physics. First, it is important to know how an image is produced physically. Before light hits sensors in a two-dimensional array, it is refracted, absorbed, scattered, or reflected, and an image is produced by measuring the intensity of the light beams through each element in the image (pixel). The three primary focuses of CV are:

Reconstructing a scene and the point from which the scene is observed based on an image, an image sequence, or a video.
Emulating biological visual perception in order to better understand which physical and biological processes are involved, how the wetware works, and how the corresponding interpretation and understanding work.
Technical research and development focuses on efficient, algorithmic solutions – when it comes to CV software, problem-specific solutions that only have limited commonalities with the visual perception of biological organisms are often developed.

All three areas overlap and influence each other. If, for example, the focus in an application is on obstacle recognition in order to initiate an automated braking maneuver in the event of a pedestrian appearing in front of the vehicle, the most important thing is to identify the pedestrian as an obstacle. Interpreting the entire scene – e.g., understanding that the vehicle is moving towards a family having a picnic in a field – is not necessary in this case. In contrast, understanding a scene is an essential prerequisite if context is a relevant input, such as is the case when developing domestic robots that need to understand that an occupant who is lying on the floor not only represents an obstacle that needs to be evaded, but is also probably not sleeping and a medical emergency is occurring.

Vision in biological organisms is regarded as an active process that includes controlling the sensor and is tightly linked to successful performance of an action[7]. Consequently,[8] CV systems are not passive either. In other words, the system must:

Be continuously provided with data via sensors (streaming)
Act based on this data stream

Having said that, the goal of CV systems is not to understand scenes in images – first and foremost, the systems must extract the relevant information for a specific task from the scene. This means that they must identify a “region of interest” that will be used for processing. Moreover, these systems must feature short response times, since it is probable that scenes will change over time and that a heavily delayed action will not achieve the desired effect. Many different methods have been proposed for object recognition purposes (“what” is located “where” in a scene), including:

Object detectors, in which case a window moves over the image and a filter response is determined for each position by comparing a template and the sub-image (window content), with each new object parameterization requiring a separate scan. More sophisticated algorithms simultaneously make calculations based on various scales and apply filters that have been learned from a large number of images.
Segment-based techniques extract a geometrical description of an object by grouping pixels that define the dimensions of an object in an image. Based on this, a fixed feature set is computed, i.e., the features in the set retain the same values even when subjected to various image transformations, such as changes in light conditions, scaling, or rotation. These features are used to clearly identify objects or object classes, one example being the aforementioned identification of traffic signs.
Alignment-based methods use parametric object models that are trained on data[9]^,[10]. Algorithms search for parameters, such as scaling, translation, or rotation, that adapt a model optimally to the corresponding features in the image, whereby an approximated solution can be found by means of a reciprocal process, i.e., by features, such as contours, corners, or others, “selecting” characteristic points in the image for parameter solutions that are compatible with the found feature.

With object recognition, it is necessary to decide whether algorithms need to process 2-D or 3-D representations of objects – 2-D representations are very frequently a good compromise between accuracy and availability. Current research (deep learning) shows that even distances between two points based on two 2-D images captured from different points can be accurately determined as an input. In daylight conditions and with reasonably good visibility, this input can be used in addition to data acquired with laser and radar equipment in order to increase accuracy – moreover, a single camera is sufficient to generate the required data. In contrast to 3-D objects, no shape, depth, or orientation information is directly encoded in 2-D images. Depth can be encoded in a variety of ways, such as with the use of laser or stereo cameras (emulating human vision) and structured light approaches (such as Kinect). At present, the most intensively pursued research direction involves the use of superquadrics – geometric shapes defined with formulas, which use any number of exponents to identify structures such as cylinders, cubes, and cones with round or sharp edges. This allows a large variety of different basic shapes to be described with a small set of parameters. If 3-D images are acquired using stereo cameras, statistical methods (such as generating a stereo point cloud) are used instead of the aforementioned shape-based methods, because the data quality achieved with stereo cameras is poorer than that achieved with laser scans.

Other research directions include tracking[11]^,[12], contextual scene understanding,[13]^,[14] and monitoring[15], although these aspects are currently of secondary importance to the automotive industry.

3.3 Inference and decision-making

This field of research, referred to in the literature as “knowledge representation & reasoning” (KRR), focuses on designing and developing data structures and inference algorithms. Problems solved by making inferences are very often found in applications that require interaction with the physical world (humans, for example), such as generating diagnostics, planning, processing natural languages, answering questions, etc. KRR forms the basis for AI at the human level.

Making inferences is the area of KRR in which data-based answers need to be found without human intervention or assistance, and for which data is normally presented in a formal system with distinct and clear semantics. Since 1980, it has been assumed that the data involved is a mixture of simple and complex structures, with the former having a low degree of computational complexity and forming the basis for research involving large databases. The latter are presented in a language with more expressive power, which requires less space for representation, and they correspond to generalizations and fine-grained information.

Decision-making is a type of inference that revolves primarily around answering questions regarding preferences between activities, for example when an autonomous agent attempts to fulfill a task for a person. Such decisions are very frequently made in a dynamic domain which changes over the course of time and when actions are executed. An example of this is a self-driving car that needs to react to changes in traffic.

Logic and combinatorics

Mathematical logic is the formal basis for many applications in the real world, including calculation theory, our legal system and corresponding arguments, and theoretical developments and evidence in the field of research and development. The initial vision was to represent every type of knowledge in the form of logic and use universal algorithms to make inferences from it, but a number of challenges arose – for example, not all types of knowledge can be represented simply. Moreover, compiling the knowledge required for complex applications can become very complex, and it is not easy to learn this type of knowledge in a logical, highly expressive language.[16] In addition, it is not easy to make inferences with the required highly expressive language – in extreme cases, such scenarios cannot be implemented computationally, even if the first two challenges are overcome. Currently, there are three ongoing debates on this subject, with the first one focusing on the argument that logic is unable to represent many concepts, such as space, analogy, shape, uncertainty, etc., and consequently cannot be included as an active part in developing AI to a human level. The counterargument states that logic is simply one of many tools. At present, the combination of representative expressiveness, flexibility, and clarity cannot be achieved with any other method or system. The second debate revolves around the argument that logic is too slow for making inferences and will therefore never play a role in a productive system. The counterargument here is that ways exist to approximate the inference process with logic, so processing is drawing close to remaining within the required time limits, and progress is being made with regard to logical inference. Finally, the third debate revolves around the argument that it is extremely difficult, or even impossible, to develop systems based on logical axioms into applications for the real world. The counterarguments in this debate are primarily based on the research of individuals currently researching techniques for learning logical axioms from natural-language texts.

In principle, a distinction is made between four different types of logic[17] which are not discussed any further in this article:

Propositional logic
First-order predicate logic
Modal logic
Non-monotonic logic

Automated decision-making, such as that found in autonomous robots (vehicles), WWW agents, and communications agents, is also worth mentioning at this point. This type of decision-making is particularly relevant when it comes to representing expert decision-making processes with logic and automating them. Very frequently, this type of decision-making process takes account of the dynamics of the surroundings, for example when a transport robot in a production plant needs to evade another transport robot. However, this is not a basic prerequisite, for example, if a decision-making process without a clearly defined direction is undertaken in future, e.g., the decision to rent a warehouse at a specific price at a specific location. Decision-making as a field of research encompasses multiple domains, such as computer science, psychology, economics, and all engineering disciplines. Several fundamental questions need to be answered to enable development of automated decision-making systems:

Is the domain dynamic to the extent that a sequence of decisions is required or static in the sense that a single decision or multiple simultaneous decisions need to be made?
Is the domain deterministic, non-deterministic, or stochastic?
Is the objective to optimize benefits or to achieve a goal?
Is the domain known to its full extent at all times? Or is it only partially known?

Logical decision-making problems are non-stochastic in nature as far as planning and conflicting behavior are concerned. Both require that the available information regarding the initial and intermediate states be complete, that actions have exclusively deterministic, known effects, and that a specific defined goal exists. These problem types are often applied in the real world, for example in robot control, logistics, complex behavior in the WWW, and in computer and network security.

In general, planning problems consist of an initial (known) situation, a defined goal, and a set of permitted actions or transitions between steps. The result of a planning process is a sequence or set of actions that, when executed correctly, change the executing entity from an initial state to a state that meets the target conditions. Computationally speaking, planning is a difficult problem, even if simple problem specification languages are used. Even when relatively simple problems are involved, the search for a plan cannot run through all state-space representations, as these are exponentially large in the number of states that define the domains. Consequently, the aim is to develop efficient algorithms that represent sub-representations in order to search through these with the hope of achieving the relevant goal. Current research is focused on developing new search methods and new representations for actions and states, which will make planning easier. Particularly when one or more agents acting against each other are taken into account, it is crucial to find a balance between learning and decision-making – exploration for the sake of learning while decisions are being made can lead to undesirable results.

Many problems in the real world are problems with dynamics of a stochastic nature. One example of this is buying a vehicle with features that affect its value, of which we are unaware. These dependencies influence the buying decision, so it is necessary to allow risks and uncertainties to be considered. For all intents and purposes, stochastic domains are more challenging when it comes to making decisions, but they are also more flexible than deterministic domains with regard to approximations – in other words, simplifying practical assumptions makes automated decision-making possible in practice. A great number of problem formulations exist, which can be used to represent various aspects and decision-making processes in stochastic domains, with the best-known being decision networks and Markov decision processes.

Many applications require a combination of logical (non-stochastic) and stochastic elements, for example when the control of robots requires high-level specifications in logic and low-level representations for a probabilistic sensor model. Processing natural languages is another area in which this assumption applies, since high-level knowledge in logic needs to be combined with low-level models of text and spoken signals.

3.4 Language and communication

In the field of artificial intelligence, processing language is considered to be of fundamental importance, with a distinction being made here between two fields: computational linguistics (CL) and natural language processing (NLP). In short, the difference is that CL research focuses on using computers for language processing purposes, while NLP consists of all applications, including machine translation (MT), Q&A, document summarization, information extraction, to name but a few. In other words, NLP requires a specific task and is not a research discipline per se. NLP comprises:

Part-of-speech tagging
Natural language understanding
Natural language generation
Automatic summarization
Named-entity recognition
Parsing
Voice recognition
Sentiment analysis
Language, topic, and word segmentation
Co-reference resolution
Discourse analysis
Machine translation
Word sense disambiguation
Morphological segmentation
Answers to questions
Relationship extraction
Sentence splitting

The core vision of AI says that a version of first-order predicate logic (“first-order predicate calculus” or “FOPC”) supported by the necessary mechanisms for the respective problem is sufficient for representing language and knowledge. This thesis says that logic can and should supply the semantics underlying natural language. Although attempts to use a form of logical semantics as the key to representing contents have made progress in the field of AI and linguistics, they have had little success with regard to a program that can translate English into formal logic. To date, the field of psychology has also failed to provide proof that this type of translation into logic corresponds to the way in which people store and manipulate “meaning.” Consequently, the ability to translate a language into FOPC continues to be an elusive goal. Without a doubt, there are NLP applications that need to establish logical inferences between sentence representations, but if these are only one part of an application, it is not clear that they have anything to do with the underlying meaning of the corresponding natural language (and consequently with CL/NLP), since the original task for logical structures was inference. These and other considerations have crystallized into three different positions:

Position 1: Logical inferences are tightly linked to the meaning of sentences, because knowing their meaning is equivalent to deriving inferences and logic is the best way to do this.
Position 2: A meaning exists outside logic, which postulates a number of semantic markers or primes that are appended to words in order to express their meaning – this is prevalent today in the form of annotations.
Position 3: In general, the predicates of logic and formal systems only appear to be different from human language, but their terms are in actuality the words as which they appear

The introduction of statistical and AI methods into the field is the latest trend within this context. The general strategy is to learn how language is processed – ideally in the way that humans do this, although this is not a basic prerequisite. In terms of ML, this means learning based on extremely large corpora that have been translated manually by humans. This often means that it is necessary to learn (algorithmically) how annotations are assigned or how part-of-speech categories (the classification of words and punctuation marks in a text into word types) or semantic markers or primes are added to corpora, all based on corpora that have been prepared by humans (and are therefore correct). In the case of supervised learning, and with reference to ML, it is possible to learn potential associations of part-of-speech tags with words that have been annotated by humans in the text, so that the algorithms are also able to annotate new, previously unknown texts. [18] This works the same way for lightly supervised and unsupervised learning, such as when no annotations have been made by humans and the only data presented is a text in a language with texts with identical contents in other languages or when relevant clusters are found in thesaurus data without there being a defined goal. [19] With regard to AI and language, information retrieval (IR) and information extraction (IE) play a major role and correlate very strongly with each other. One of the main tasks of IR is grouping texts based on their content, whereas IE extracts similarly factual elements from texts or is used to be able to answer questions concerning text contents. These fields therefore correlate very strongly with each other, since individual sentences (not only long texts) can also be regarded as documents. These methods are used, for example, in interactions between users and systems, such as when a driver asks the on-board computer a question regarding the owner’s manual during a journey – once the language input has been converted into text, the question’s semantic content is used as the basis for finding the answer in the manual, and then for extracting the answer and returning it to the driver.

3.5 Agents and actions

In traditional AI, people focused primarily on individual, isolated software systems that acted relatively inflexibly to predefined rules. However, new technologies and applications have established a need for artificial entities that are more flexible, adaptive, and autonomous, and that act as social units in multi-agent systems. In traditional AI (see also “physical symbol system hypothesis”[20] that has been embedded into so-called “deliberative” systems), an action theory that establishes how systems make decisions and act is represented logically in individual systems that must execute actions. Based on these rules, the system must prove a theorem – the prerequisite here being that the system must receive a description of the world in which it currently finds itself, the desired target state, and a set of actions, together with the prerequisites for executing these actions and a list of the results for each action. It turned out that the computational complexity involved rendered any system with time limits useless even when dealing with simple problems, which had an enormous impact on symbolic AI, resulting in the development of reactive architectures. These architectures follow if-then rules that translate inputs directly into tasks. Such systems are extremely simple, although they can solve very complex tasks. The problem is that such systems learn procedures rather than declarative knowledge, i.e., they learn attributes that cannot easily be generalized for similar situations. Many attempts have been made to combine deliberative and reactive systems, but it appears that it is necessary to focus either on impractical deliberative systems or on very loosely developed reactive systems – focusing on both is not optimal.

Principles of the new, agent-centered approach

The agent-oriented approach is characterized by the following principles:

Autonomous behavior:

“Autonomy” describes the ability of systems to make their own decisions and execute tasks on behalf of the system designer. The goal is to allow systems to act autonomously in scenarios where controlling them directly is difficult. Traditional software systems execute methods after these methods have been called, i.e., they have no choice, whereas agents make decisions based on their beliefs, desires, and intentions (BDI)[21].

Adaptive behavior:

Since it is impossible to predict all the situations that agents will encounter, these agents must be able to act flexibly. They must be able to learn from and about their environment and adapt accordingly. This task is all the more difficult if not only nature is a source of uncertainty, but the agent is also part of a multi-agent system. Only environments that are not static and self-contained allow for an effective use of BDI agents – for example, reinforcement learning can be used to compensate for a lack of knowledge of the world. Within this context, agents are located in an environment that is described by a set of possible states. Every time an agent executes an action, it is “rewarded” with a numerical value that expresses how good or bad the action was. This results in a series of states, actions, and rewards, and the agent is compelled to determine a course of action that entails maximization of the reward.

Social behavior:

In an environment where various entities act, it is necessary for agents to recognize their adversaries and form groups if this is required by a common goal. Agent-oriented systems are used for personalizing user interfaces, as middleware, and in competitions such as the RoboCup. In a scenario where there are only self-driving cars on roads, the individual agent’s autonomy is not the only indispensable component – car2car communications, i.e., the exchange of information between vehicles and acting as a group on this basis, are just as important. Coordination between the agents results in an optimized flow of traffic, rendering traffic jams and accidents virtually impossible (see also section 5.1, “Vehicles as autonomous, adaptive, and social agents & cities as super-agents”).

In summary, this agent-oriented approach is accepted within the AI community as the direction of the future.

Multi-agent behavior

Various approaches are being pursued for implementing multi-agent behavior, with the primary difference being in the degree of control that designers have over individual agents.[22]^,[23],[24] A distinction is made here between:

Distributed problem-solving systems (DPS)
Multi-agent systems (MAS)

DPS systems allow the designer to control each individual agent in the domain, with the solution to the task being distributed among multiple agents. In contrast, MAS systems have multiple designers, each of whom can only influence their own agents with no access to the design of any other agent. In this case, the design of the interaction protocols is extremely important. In DPS systems, agents jointly attempt to achieve a goal or solve a problem, whereas, in MAS systems, each agent is individually motivated and wants to achieve its own goal and maximize its own benefit. The goal of DPS research is to find collaboration strategies for problem-solving, while minimizing the level of communication required for this purpose. Meanwhile, MAS research is looking at coordinated interaction, i.e., how autonomous agents can be brought to find a common basis for communication and undertake consistent actions.[25] Ideally, a world in which only self-driving cars use the road would be a DPS world. However, the current competition between OEMs means that a MAS world will come into being first. In other words, communication and negotiation between agents will take center stage (see also Nash equilibrium).

Multi-agent learning

Multi-agent learning (MAL) has only relatively recently been bestowed a certain degree of attention.[26]^{,[27],[28],[29]} The key problems in this area include determining which techniques should be used and what exactly “multi-agent learning” means. Current ML approaches were developed in order to train individual agents, whereas MAL focuses first and foremost on distributed learning. “Distributed” does not necessarily mean that a neural network is used, in which many identical operations run during training and can accordingly be parallelized, but instead that:

A problem is split into subproblems and individual agents learn these subproblems in order to solve the main problem using their combined knowledge OR
Many agents try to solve the same problem independently of each other by competing with each other

Reinforcement learning is one of the approaches being used in this context.[30]

4 Data mining and artificial intelligence in the automotive industry

At a high level of abstraction, the value chain in the automotive industry can broadly be described with the following subprocesses:

Development
Procurement
Logistics
Production
Marketing
Sales, after-sales, and retail
Connected customer

Each of these areas already features a significant level of complexity, so the following description of data mining and artificial intelligence applications has necessarily been restricted to an overview.

4.1 Development

Vehicle development has become a largely virtual process that is now the accepted state of the art for all manufacturers. CAD models and simulations (typically of physical processes, such as mechanics, flow, acoustics, vibration, etc., on the basis of finite element models) are used extensively in all stages of the development process.

The subject of optimization (often with the use of evolution strategies[31] or genetic algorithms and related methods) is usually less well covered, even though it is precisely here in the development process that it can frequently yield impressive results. Multi-disciplinary optimization, in which multiple development disciplines (such as occupant safety and noise, vibration, and harshness (NVH)) are combined and optimized simultaneously, is still rarely used in many cases due to supposedly excessive computation time requirements. However, precisely this approach offers enormous potential when it comes to agreeing more quickly and efficiently across the departments involved on a common design that is optimal in terms of the requirements of multiple departments.

In terms of the analysis and further use of simulation results, data mining is already being used frequently to generate co-called “response surfaces.” In this application, data mining methods (the entire spectrum, ranging from linear models to Gaussian processes, support vector machines, and random forests) are used in order to learn a nonlinear regression model as an approximation of the representation of the input vectors for the simulation based on the relevant (numerical) simulation results[32]. Since this model needs to have good interpolation characteristics, cross-validation methods that allow the model’s prediction quality for new input vectors to be estimated are typically used for training the algorithms. The goal behind this use of supervised learning methods is frequently to replace computation-time-consuming simulations with a fast approximation model that, for example, represents a specific component and can be used in another application. In addition, this allows time-consuming adjustment processes to be carried out faster and with greater transparency during development.

One example: It is desirable to be able to immediately evaluate the forming feasibility[33] of geometric variations in components during the course of an interdepartmental meeting instead of having to run complex simulations and wait one or two days for the results. A response surface model that has been previously trained using simulations can immediately provide a very good approximation of the risk of excessive thinning or cracks in this type of meeting, which can then be used immediately for evaluating the corresponding geometry.

These applications are frequently focused on or limited to specific development areas, which, among other reasons, is due to the fact that simulation data management, in its role as a central interface between data generation and data usage and analysis, constitutes a bottleneck. This applies especially when simulation data is intended for use across multiple departments, variants, and model series, as is essential for real use of data in the sense of a continuously learning development organization. The current situation in practice is that department-specific simulation data is often organized in the form of file trees in the respective file system within a department, which makes it difficult to access for an evaluation based on machine learning methods. In addition, simulation data may already be very voluminous for an individual simulation (in the range of terabytes for the latest CFD simulations), so efficient storage solutions are urgently required for machine-learning-based analyses.

While simulation and the use of nonlinear regression models limited to individual applications have become the standard, the opportunities offered by optimizing analytics are rarely being exploited. Particularly with regard to such important issues as multi-disciplinary (as well as cross-departmental) machine learning, learning based on historical data (in other words, learning from current development projects for future projects), and cross-model learning, there is an enormous and completely untapped potential for increasing efficiency.

4.2 Procurement

The procurement process uses a wide variety of data concerning suppliers, purchase prices, discounts, delivery reliability, hourly rates, raw material specifications, and other variables. Consequently, computing KPIs for the purpose of evaluating and ranking suppliers poses no problem whatsoever today. Data mining methods allow the available data to be used, for example, to generate forecasts, to identify important supplier characteristics with the greatest impact on performance criteria, or to predict delivery reliability. In terms of optimizing analytics, the specific parameters that an automotive manufacturer can influence in order to achieve optimum conditions are also important.

Overall, the finance business area is a very good field for optimizing analytics, because the available data contains information about the company’s main success factors. Continuous monitoring[34] is worth a brief mention as an example, here with reference to controlling. This monitoring is based on finance and controlling data, which is continuously prepared and reported. This data can also be used in the sense of predictive analytics in order to automatically generate forecasts for the upcoming week or month. In terms of optimizing analytics, analyses of the key influencing parameters, together with suggested optimizing actions, can also be added to the aforementioned forecasts.

These subject areas are more of a vision than a reality at present, but they do convey an idea of what could be possible in the fields of procurement, finance, and controlling.

4.3 Logistics

In the field of logistics, a distinction can be made between procurement logistics, production logistics, distribution logistics, and spare parts logistics.

Procurement logistics considers the process chain extending from the purchasing of goods through to shipment of the material to the receiving warehouse. When it comes to the purchasing of goods, a large amount of historical price information is available for data mining purposes, which can be used to generate price forecasts and, in combination with delivery reliability data, to analyze supplier performance. As for shipment, optimizing analytics can be used to identify and optimize the key cost factors.

A similar situation applies to production logistics, which deals with planning, controlling, and monitoring internal transportation, handling, and storage processes. Depending on the granularity of the available data, it is possible to identify bottlenecks, optimize stock levels, and minimize the time required, for example here.

Distribution logistics deals with all aspects involved in transporting products to customers, and can refer to both new and used vehicles for OEMs. Since the primary considerations here are the relevant costs and delivery reliability, all the subcomponents of the multimodal supply chain need to be taken into account – from rail to ship and truck transportation through to subaspects such as the optimal combination of individual vehicles on a truck. In terms of used-vehicle logistics, optimizing analytics can be used to assign vehicles to individual distribution channels (e.g., auctions, Internet) on the basis of a suitable, vehicle-specific resale value forecast in order to maximize total sale proceeds. GM implemented this approach as long ago as 2003 in combination with a forecast of expected vehicle-specific sales revenue[35].

In spare parts logistics, i.e., the provision of spare parts and their storage, data-driven forecasts of the number of spare parts needing to be held in stock depending on model age and model (or sold volumes) are one important potential application area for data mining, because it can significantly decrease the storage costs.

As the preceding examples show, data analytics and optimization must frequently be coupled with simulations in the field of logistics, because specific aspects of the logistics chain need to be simulated in order to evaluate and optimize scenarios. Another example is the supplier network, which, when understood in greater depth, can be used to identify and avoid critical paths in the logistics chain, if possible. This is particularly important, as the failure of a supplier to make a delivery on the critical path would result in a production stoppage for the automaker. Simulating the supplier network not only allows this type of bottleneck to be identified, but also countermeasures to be optimized. In order to facilitate a simulation that is as detailed and accurate as possible, experience has shown that mapping all subprocesses and interactions between suppliers in detail becomes too complex, as well as nontransparent for the automobile manufacturer, as soon as attempts are made to include Tier 2 and Tier 3 suppliers as well.

This is why data-driven modeling should be considered as an alternative. When this approach is used, a model is learned from the available data about the supplier network (suppliers, products, dates, delivery periods, etc.) and the logistics (stock levels, delivery frequencies, production sequences) by means of data mining methods. The model can then be used as a forecast model in order, for example, to predict the effects of a delivery delay for specific parts on the production process. Furthermore, the use of optimizing analytics in this case makes it possible to perform a worst-case analysis, i.e., to identify the parts and suppliers that would bring about production stoppages the fastest if their delivery were to be delayed. This example very clearly shows that optimization, in the sense of scenario analysis, can also be used to determine the worst-case scenario for an automaker (and then to optimize countermeasures in future).

4.4 Production

Every sub-step of the production process will benefit from the consistent use of data mining. It is therefore essential for all manufacturing process parameters to be continuously recorded and stored. Since the main goal of optimization is usually to improve quality or reduce the incidence of defects, data concerning the defects that occur and the type of defect is required, and it must be possible to clearly assign this data to the process parameters. This approach can be used to achieve significant improvements, particularly in new types of production process – one example being CFRP[36]. Other potential optimization areas include energy consumption and the throughput of a production process per time unit. Optimizing analytics can be applied both offline and online in this context.

When used in offline applications, the analysis identifies variables that have a significant influence on the process. Furthermore, correlations are derived between these influencing variables and their targets (quality, etc.) and, if applicable, actions are also derived from this, which can improve the targets. Frequently, such analyses focus on a specific problem or an urgent issue with the process and can deliver a solution very efficiently – however, they are not geared towards continuous process optimization. Conducting the analyses and interpreting and implementing the results consistently requires manual sub-steps that can be carried out by data scientists or statisticians – usually in consultation with the respective process experts.

In the case of online applications, there is a very significant difference in the fact that the procedure is automated, resulting in completely new challenges for data acquisition and integration, data pre-processing, modeling, and optimization. In these applications, even the provision of process and quality data needs to be automated, as this provides integrated data that can be used as a basis for modeling at any time. This is crucial given that modeling always needs to be performed when changes to the process (including drift) are detected. The resulting forecast models are then used automatically for optimization purposes and are capable of, e.g., forecasting the quality and suggesting (or directly implementing) actions for optimizing the relevant target variable (quality in this case) even further. This implementation of optimizing analytics, with automatic modeling and optimization, is technically available, although it is more a vision than a reality for most users today.

The potential applications include forming technology (conventional as well as for new materials), car body manufacture, corrosion protection, painting, drive trains, and final assembly, and can be adapted to all sub-steps. An integrated analysis of all process steps, including an analysis of all potential influencing factors and their impact on overall quality, is also conceivable in future – in this case, it would be necessary to integrate the data from all subprocesses.

4.5 Marketing

The focus in marketing is to reach the end customer as efficiently as possible and to convince people either to become customers of the company or to remain customers. The success of marketing activities can be measured in sales figures, whereby it is important to differentiate marketing effects from other effects, such as the general financial situation of customers. Measuring the success of marketing activities can therefore be a complex endeavor, since multivariate influencing factors can be involved.

It would also be ideal if optimizing analytics could always be used in marketing, because optimization goals, such as maximizing return business from a marketing activity, maximizing sales figures while minimizing the budget employed, optimizing the marketing mix, and optimizing the order in which things are done, are all vital concerns. Forecast models, such as those for predicting additional sales figures over time as a result of a specific marketing campaign, are only one part of the required data mining results – multi-criteria decision-making support also plays a decisive role in this context.

Two excellent examples of the use of data mining in marketing are the issues of churn (customer turnover) and customer loyalty. In a saturated market, the top priority for automakers is to prevent loss of custom, i.e., to plan and implement optimal countermeasures. This requires information that is as individualized as possible concerning the customer, the customer segment to which the customer belongs, the customer’s satisfaction and experience with their current vehicle, and data concerning competitors, their models, and prices. Due to the subjectivity of some of this data (e.g., satisfaction surveys, individual satisfaction values), individualized churn predictions and optimal countermeasures (e.g., personalized discounts, refueling or cash rewards, incentives based on additional features) are a complex subject that is always relevant.

Since maximum data confidentiality is guaranteed and no personal data is recorded – unless the customer gives their explicit consent in order to receive offers as individually tailored as possible – such analyses and optimizations are only possible at the level of customer segments that represent the characteristics of an anonymous customer subset.

Customer loyalty is closely related to this subject, and takes on board the question of how to retain and optimize, i.e., increase the loyalty of existing customers. Likewise, the topic of “upselling,” i.e., the idea of offering existing customers a higher-value vehicle as their next one and being successful with this offer, is always associated with this. It is obvious that such issues are complex, as they require information about customer segments, marketing campaigns, and correlated sales successes in order to facilitate analysis. However, this data is mostly not available, difficult to collect systematically, and characterized by varying levels of veracity, i.e., uncertainty in the data.

Similar considerations apply to optimizing the marketing mix, including the issue of trade fair participation. In this case, data needs to be collected over longer periods of time, so that it can be evaluated and conclusions can be drawn. For individual marketing campaigns such as mailing campaigns, evaluating the return business rate with regard to the characteristics of the selected target group is a much more likely objective of a data analysis and corresponding campaign optimization.

In principle, very promising potential applications for optimizing analytics can also be found in the marketing field. However, the complexity involved in data collection and data protection, as well as the partial inaccuracy of data collected, means that a long-term approach with careful planning of the data collection strategy is required. The issue becomes even more complex if “soft” factors such as brand image also need to be taken into account in the data mining process – in this case, all data has a certain level of uncertainty, and the corresponding analyses (“What are the most important brand image drivers?” “How can the brand image be improved?”) are more suitable for determining trends than drawing quantitative conclusions. Nevertheless, within the scope of optimization, it is possible to determine whether an action will have a positive or negative impact, thereby allowing the direction to be determined, in which actions should go.

4.6 Sales, after-sales, and retail

The diversity of potential applications and existing applications in this area is significant. Since the “human factor,” embodied by the end customer, plays a crucial role within this context, it is not only necessary to take into account objective data such as sales figures, individual price discounts, and dealer campaigns; subjective customer data such as customer satisfaction analyses based on surveys or third-party market studies covering such subjects as brand image, breakdown rates, brand loyalty, and many others may also be required. At the same time, it is often necessary to procure and integrate a variety of data sources, make them accessible for analysis, and finally analyze them correctly in terms of the potential subjectivity of the evaluations[37] – a process that currently depends to a large extent on the expertise of the data scientists conducting the analysis.

The field of sales itself is closely intermeshed with marketing. After all, the ultimate objective is to measure the success of marketing activities in terms of turnover based on sales figures. A combined analysis of marketing activities (including distribution among individual media, placement frequency, costs of the respective marketing activities, etc.) and sales can be used to optimize market activities in terms of cost and effectiveness, in which case a portfolio-based approach is always used. This means that the optimum selection of a portfolio of marketing activities and their scheduling – and not just focusing on a single marketing activity – is the main priority. Accordingly, the problem here comes from the field of multi-criteria decision-making support, in which decisive breakthroughs have been made in recent years thanks to the use of evolutionary algorithms and new, portfolio-based optimization criteria. However, applications in the automotive industry are still restricted to a very limited scope.

Similarly, customer feedback, warranty repairs, and production are potentially intermeshed as well, since customer satisfaction can be used to derive soft factors and warranty repairs can be used to derive hard factors, which can then be coupled with vehicle-specific production data and analyzed. In this way, factors that affect the occurrence of quality defects not present or foreseeable at the factory can be determined. This makes it possible to forecast such quality defects and use optimizing analytics to reduce their occurrence. Nevertheless, it is also necessary to combine data from completely different areas – production, warranty, and after-sales – in order to make it accessible to the analysis.

In the case of used vehicles, residual value plays a vital role in a company’s fleet or rental car business, as the corresponding volumes of tens of thousands of vehicles are entered into the balance sheet as assets with the corresponding residual value. Today, OEMs typically transfer this risk to banks or leasing companies, although these companies may in turn be part of the OEM’s corporate group. Data mining and, above all, predictive analytics can play a decisive role here in the correct evaluation of assets, as shown by an American OEM as long as ten years ago[38]. Nonlinear forecasting models can be used with the company’s own sales data to generate individualized, equipment-specific residual value forecasts at the vehicle level, which are much more accurate than the models currently available as a market standard. This also makes it possible to optimize distribution channels – even as far as geographically assigning used vehicles to individual auction sites at the vehicle level – in such a way as to maximize a company’s overall sales success on a global basis.

Considering sales operations in greater detail, it is obvious that knowledge regarding each individual customer’s interests and preferences when buying a vehicle or, in future, temporarily using available vehicles, is an important factor. The more individualized the knowledge concerning the sociodemographic factors for a customer, their buying behavior, or even their clicking behavior on the OEM’s website, as well as their driving behavior and individual use of a vehicle, the more accurately it will be possible to address their needs and provide them with an optimal offer for a vehicle (suitable model with appropriate equipment features) and its financing.

4.7 Connected customer

While this term is not yet established as such at present, it does describe a future in which both the customer and their vehicle are fully integrated with state-of-the-art information technology. This aspect is closely linked to marketing and sales issues, such as customer loyalty, personalized user interfaces, vehicle behavior in general, and other visionary aspects (see also section 5). With a connection to the Internet and by using intelligent algorithms, a vehicle can react to spoken commands and search for answers that, for example, can communicate directly with the navigation system and change the destination. Communication between vehicles makes it possible to collect and exchange information on road and traffic conditions, which is much more precise and up-to-date than that which can be obtained via centralized systems. One example is the formation of black ice, which is often very localized and temporary, and which can be detected and communicated in the form of a warning to other vehicles very easily today.

5 Vision

Vehicle development already makes use of “modular systems” that allow components to be used across multiple model series. At the same time, development cycles are becoming increasingly shorter. Nevertheless, the field of virtual vehicle development has not yet seen any effective attempts to use machine learning methods in order to facilitate automatic learning that extracts both knowledge that is built upon other historical knowledge and knowledge that applies to more than one model series so as to assist with future development projects and organizing them more efficiently. This topic is tightly intermeshed with that of data management, the complexity of data mining in simulation and optimization data, and the difficulty in defining a suitable representation of knowledge concerning vehicle development aspects. Furthermore, this approach is restricted by the organizational limitations of the vehicle development process, which is often still exclusively oriented towards the model being developed. Moreover, due to the heterogeneity of data (often numerical data, but also images and videos, e.g., from flow fields) and the volume of data (now in the terabyte range for a single simulation), the issue of “data mining in simulation data” is extremely complex and, at best, the object of tentative research approaches at this time[39].

New services are becoming possible due to the use of predictive maintenance. Automatically learned knowledge regarding individual driving behavior – i.e., annual, seasonal, or even monthly mileages, as well as the type of driving – can be used to forecast intervals for required maintenance work (brake pads, filters, oil, etc.) with great accuracy. Drivers can use this information to schedule garage appointments in a timely manner, and the vision of a vehicle that can schedule garage appointments by itself in coordination with the driver’s calendar – which is accessible to the on-board computer via appropriate protocols – is currently more realistic than the often cited refrigerator that automatically reorders groceries.

In combination with automatic optimization, the local authorized repair shop, in its role as a central coordination point where individual vehicle service requests arrive via an appropriate telematics interface, can optimally schedule service appointments in real time – keeping workloads as evenly distributed as possible while taking staff availability into account, for example.

With regard to the vehicle’s learning and adaptation abilities, there is virtually limitless potential. Vehicles can identify and classify their drivers’ driving behavior – i.e., assign them to a specific driver type. Based on this, the vehicles themselves can make adjustments to systems ranging from handling to the electronic user interface – in other words, they can offer drivers individualization and adaptation options that extend far beyond mere equipment features. The learned knowledge about the driver can then be transferred to a new vehicle when one is purchased, ensuring that the driver’s familiar environment is immediately available again.

5.1 Vision – Vehicles as autonomous, adaptive, and social agents & cities as super-agents

Research into self-driving cars is here to stay in the automotive industry, and the “mobile living room” is no longer an implausible scenario, but is instead finding a more and more positive response. Today, the focus of development is on autonomy, and for good reason: In most parts of the world, self-driving cars are not permitted on roads, and if they are, they are not widespread. This means that the vehicle as an agent cannot communicate with all other vehicles, and that vehicles driven by humans adjust their behavior based on the events in their drivers’ field of view. Navigation systems offer support by indicating traffic congestion and suggesting alternative routes. However, we now assume that every vehicle is a fully connected agent, with the two primary goals of:

Contributing to optimizing the flow of traffic
Preventing accidents

In this scenario, agents communicate with each other and negotiate routes with the goal of minimizing total travel time (obvious parameters being, for example, the route distance, the possible speed, roadworks, etc.). Unforeseeable events are minimized, although not eliminated completely – for example, storm damage would still result in a road being blocked. This type of information would then need to be communicated immediately to all vehicles in the relevant action area, after which a new optimization cycle would be required. Moreover, predictive maintenance minimizes damage to vehicles. Historical data is analyzed and used to predict when a defect would be highly likely to occur, and the vehicle (the software in the vehicle, i.e., the agent) makes a service appointment without requiring any input from the passenger and then drives itself to the repair shop –all made possible with access to the passenger’s calendar. In the event of damage making it impossible to continue a journey, this would also be communicated as quickly as possible – either with a “breakdown” broadcast or to a control center, and a self-driving tow truck would be immediately available to provide assistance, ideally followed by a (likewise self-driving) replacement vehicle. In other words, vehicles act:

Autonomously in the sense that they automatically follow a route to a destination
Adaptively in the sense that they can react to unforeseen events, such as road closures and breakdowns
Socially in the sense that they work together to achieve the common goals of optimizing the flow of traffic and preventing accidents (although the actual situation is naturally more complex and many subgoals need to be defined in order for this to be achieved).

In combination with taxi services that would also have a self-driving fleet, it would be possible to use travel data and information on the past use of taxi services (provided that the respective user gives their consent) in order to send taxis to potential customers at specific locations without the need for these customers to actively request the taxis. In a simplified form, it would also be possible to implement this in an anonymized manner, for example, by using data to identify locations where taxis are frequently needed (as identified with the use of clusters; see also section 3.1, “Machine learning”) at specific times or for specific events (rush hour, soccer matches, etc.).

If roads become digital as well, i.e., if asphalt roads are replaced with glass and supplemented with OLED technology, dynamic changes to traffic management would also be possible. From a materials engineering perspective, this is feasible:

The surface structure of glass can be developed in such a way as to be skid resistant, even in the rain.
Glass can be designed to be so flexible and sturdy that it will not break, even when trucks drive over it.
The waste heat emitted by displays can be used to heat roads and prevent ice from forming during winter.

In this way, cities themselves can be embedded as agents in the multi-agent environment and help achieve the defined goals.

5.2 Vision – integrated factory optimization

By using software to analyze customer and repair shop reports and repair data concerning defects occurring in the field, we can already automatically analyze whether an increase in defects can be expected for specific vehicle models or installed parts. The goal here is to identify and avoid potential problems at an early stage, before large-scale recall actions need to be initiated. Causes of defects in the field can be manifold, including deficient quality of the parts being used or errors during production, which, together with the fact that thousands of vehicles leave Volkswagen production plants every day, makes it clear that acting quickly is of utmost importance. Assuming that at present (November 2015), the linguistic analysis of customer statements and repair shop reports shows that a significant increase in right-hand-side parking light failures can be expected for model x, platform C vehicles delivered from July 2015 onwards. In this case, “significant” means that a statistically verifiable upwards trend (increase in reported malfunctions) can be extracted based on vehicle sales between January 2015 and November 2015. By analyzing fault chains and repair chains, it is possible to determine which events will result in a fault or defect or which other models are or will be affected. If the error in production is caused by a production robot, for example, this can be traced back to a hardware fault and/or software error or to an incorrect or incomplete configuration. In the worst-case scenario, it may even be necessary to update the control system in order to eliminate the error. In addition, it is not possible to update software immediately, because work on patches can only begin after the manufacturer has received and reviewed the problem report. Likewise, reconfiguring the robot can be a highly complex task due to the degrees of freedom (axes of rotation) that multi-axis manipulators have at their disposal. In short, making such corrections is time-consuming, demanding, and, in all but ideal scenarios, results in subsequent issues.

Artificial intelligence (AI) approaches can be used to optimize this process at several points.

One of the areas being addressed by AI research is enabling systems (where the term “system” is a synonym for “a whole consisting of multiple individual parts,” especially in the case of software or combinations of hardware and software, such as those found in industrial robots) to automatically extract and interpret knowledge from data, although the extent of this is still limited at present.

Figure 4 – Data, knowledge, action

In contrast to data, knowledge can form the basis for an action, and the result of an action can be fed back into data, which then forms the basis for new knowledge, and so on.

If an agent with the ability to learn and interpret data is supplied with the results (state of the world before the action, state of the world after the action; see also section 3.5) of its own actions or of the actions of other agents, the agent, provided it has a goal and the freedom to adapt as necessary, will attempt to achieve its goal autonomously. Biological agents, such as humans and animals, do this intuitively without needing to actively control or monitor the process of transforming data into knowledge. If, for instance, wood in a DIY project splits because we hammered in a nail too hard at an excessively acute angle, our brain subconsciously transforms the angle, the material’s characteristics, and the force of the hammer blow into knowledge and experience, minimizing the likelihood of us repeating the same mistake.

In the previously discussed, specific area of artificial intelligence referred to as “machine learning,” research is focused on emulating such behavior. Using ML to enable software to learn from data in a specific problem domain and to infer how to solve new events on the basis of past events opens up a world of new possibilities. ML is nothing new in the field of data analysis, where it has been used for many years now. What is new is the possibility to compute highly complex models with data volumes in the petabyte range within a specific time limit. If one thinks of a production plant as an organism pursuing the objective of producing defect-free vehicles, it is clear that granting this organism access to relevant data would help the organism with its own development and improvement, provided, of course, that this organism has the aforementioned capabilities.

Two stages of development are relevant in this case:

Stage 1 – Learning from data and applying experiences

In order to learn from data, a robot must not just operate according to static programming, it must also be able to use ML methods to work autonomously towards defined learning goals. With regard to any production errors that may occur, this means, first and foremost, that the actions being carried out that result in these errors will have been learned, and not programmed based on a flowchart and an event diagram. Assume, for example, that the aforementioned parking light problem has not only been identified, but that its cause can also been traced back to an issue in production, e.g., a robot that is pushing a headlamp into its socket too hard. All that’s now required is to define the learning goal for the corrective measure. Let us also assume that the production error is not occurring with robots in other production plants, and that left-hand headlamps are being installed correctly in general. In the best-case scenario, we, as humans, would be able to visually recognize and interpret the difference between robots that are working correctly and robots that are not – and the robot making the mistake should be able to learn in a similar way. The difference here is in the type of perception involved – digital systems can “see” much better than us in such cases. Even though the internal workings of ML methods implemented by means of software are rarely completely transparent during the learning process – even for the developer of the learning system – due to the stochastic components and complexity involved, the action itself is transparent, i.e., not how a system does something, but what it does. These signals need to be used in order to initiate the learning process anew and to adapt the control system of the problematic robot. In the aforementioned case, these would be the manipulator and effector motion signals of a robot that is working correctly, which can be measured and defined with any desired level of accuracy. This does not require any human intervention, as the system’s complete transparency is ensured by continuously securing and analyzing the data accrued in the production process. Neither is any human analysis required in the identification and transmission of defects from the field. Based on linguistic analyses of repair shop and customer reports, together with repair data, we can already swiftly identify which problems are attributable to production. The corresponding delivery of this data to the relevant agents (the data situation makes it possible to determine exactly which machine needs to correct itself) allows these agents to learn from the defects and correct themselves.

Stage 2 – Overcoming the limitations of programming – smart factories as individuals

What if the production plant needs to learn things for which even the flexibility of one or more ML methods used by individual agents (such as production or handling robots) is insufficient? Just like a biological organism, a production plant could act as a separate entity composed of subcomponents, similarly to a human, who can be addressed using natural language, understands context, and is capable of interpreting this. The understanding and interpretation of context have always been a challenge in the field of AI research. AI theory views context as a shared (or common) interpretation of a situation, with the context of a situation and the context of an entity relative to a situation being relevant here. Contexts relevant to a production plant include everything that is relevant to production when expressed in natural language or any other way. The following simplified scenario helps in understanding the concept: Let us assume that the final design for a car body is agreed upon by a committee during a meeting.

“We decided on the following body for the Golf 15 facelift. Please build a prototype accordingly based on the Golf 15,” says Mr. Müller while looking at the 3-D model that seems to be floating in front of everyone at the meeting and can only be seen with augmented reality glasses.

In this scenario, use of evolutionary algorithms for simulation is conceivable, limited to the possible combinations that can actually be built. Provided that the required computing power is available and the parameters involved have been reduced, this can cut simulation times from several hours to minutes, making dynamic morphing of components or component combinations possible during a meeting.

Factory : “Based on the model input, I determined that it will take 26 minutes to adjust the programming of my robots. In order to assemble the floor assembly, tools x₁, y₁ must be replaced with tools x₂, y₂ on robots x, y. Production of the prototype will be completed in 6 hours and 37 minutes.”

Of course, this scenario is greatly simplified, but it should still show what the future may hold. In order to understand what needs to be done, the production plant must understand what a car body is, what a facelift is, etc. and interpret the parameters and output from a simulation in such a way that they can be converted into production steps. This conversion into production steps requires the training of individual ML components on the robots or the adaptation/enhancement of their programs based on the simulation data, so that all steps can be carried out, from cutting sheet metal to assembling and integrating the (still fictitious) Golf 15 basic variant. And while this encompasses a wide range of methods, extending from natural language understanding and language generation through to planning, optimization, and autonomous model generation, it is by no means mere science fiction.

5.3 Vision – companies acting autonomously

When planning marketing activities or customer requirements, for example, it is imperative for companies of all types to monitor how sales change over time, to predict how markets will develop and which customers will potentially be lost, to respond to financial crises, and to quickly interpret the potential impact of catastrophes or political structures. We already do all this today, and what we need for it is data. We are not interested in the personal data of individuals, but in what can be derived from many individual components. For example, by analyzing over 1,600 indicators, we can predict how certain financial indicators for markets will move and respond accordingly or we can predict, with a high probability of being correct, which customer groups find models currently in pre-production development appealing and then derive marketing actions accordingly. In fact, we can go so far as to determine fully configured models to suit the tastes of specific customer groups. With this knowledge, we make decisions, adjust our production levels, prepare marketing plans, and propose fully configured models appropriate for specific customer groups in the configurator.

Preparing a marketing plan sometimes follows a static process (what needs to be done), but how something is done remains variable. As soon as it is possible to explain to another person how and why something is being done, this information can also be made available to algorithms. Breaking it down into an example, we can predict that one of our competitors opening a new production plant in a country where we already have manufacturing operations would result in us having to expect a drop in our sales. We can even predict (within a certain fluctuation range) how large this expected drop in sales would be. In this case, “we” refers to the algorithms that we have developed for this specific use case. Since this situation occurs more than once and requires (virtually) identical input parameters every time, we can use the same algorithms to predict events in other countries. This makes it possible to use knowledge from past marketing campaigns in order to conduct future campaigns. In short, algorithms would prepare the marketing plans.

When provided with a goal, such as maximizing the benefit for our customers while taking account of cost-effectiveness, algorithm sub-classes would be able to take internal data (such as sales figures and configurator data) and external data (such as stock market trends, financial indicators, political structures) to autonomously generate output for a “marketing program” or “GDP program.” If a company were allowed to use its resources and act autonomously, then it would be able to react autonomously to fluctuations in markets, subsidize vulnerable suppliers, and much more. The possibilities are wide-ranging, and such scenarios can already be technically conceived today. Continuously monitoring stock prices, understanding and interpreting news items, and taking into account demographic changes are only a few of the areas that are relevant and that, among other things, require a combination of natural language understanding, knowledge-based systems, and the ability to make logical inferences. In the field of AI research, language and visual information are very frequently used as the basis for understanding things, because we humans also learn and understand a great deal using language and visual stimuli.

6 Conclusions

Artificial intelligence has already found its way into our daily lives, and is no longer solely the subject of science fiction novels. At present, AI is used primarily in the following areas:

Analytical data processing
Domains in which qualified decisions need to be made quickly on the basis of a large amount of (often heterogeneous) data
Monotonous activities that still require constant alertness

In the field of analytical data processing, the next few years will see us transition from exclusive use of decision-support systems to additional use of systems that make decisions on our behalf. Particularly in the field of data analysis, we are currently developing individual analytical solutions for specific problems, although these solutions cannot be used across different contexts – for example, a solution developed to detect anomalies in stock price movements cannot be used to understand the contents of images. This will remain the case in the future, although AI systems will integrate individual interacting components and consequently be able to take care of increasingly complex tasks that are currently reserved exclusively for humans – a clear trend that we can already observe today. A system that not only processes current data regarding stock markets, but that also follows and analyzes the development of political structures based on news texts or videos, extracts sentiments from texts in blogs or social networks, monitors and predicts relevant financial indicators, etc. requires the integration of many different subcomponents – getting these to interact and cooperate is the subject of current research, and new advances in this field are being published every week. In a world where AI systems are able to improve themselves continuously and, for example, manage companies more effectively than humans, what would be left for humans? Time for expanding one’s knowledge, improving society, eradicating hunger, eliminating diseases, and spreading our species beyond our own solar system.[40] Some theories say that quantum computers are required in order to develop powerful AI systems[41], and only a very careless person would suggest than an effective quantum computer will be available within the next 10 years. Then again, only a careless person would suggest that it will not. Regardless of this, and as history has taught us time and time again with the majority of relevant scientific accomplishments, caution will also have to be exercised when implementing artificial intelligence – systems capable of making an exponentially larger number of decisions in extremely short times as hardware performance improves can achieve many positive things, but they can also be misused.

Authors

Dr. Martin Hofmann

Dr. Martin Hofmann is Executive Vice President of the Volkswagen Group and Group CIO at Volkswagen AG. One of his most prominent initiatives has been the creation of Information Technology Labs in San Francisco, Berlin and Munich, which are considered leading in the field of data science, applied AI and Machine Learning in the automotive industry. He has a long history at the Volkswagen Group, joining in 2001, where he took charge of Group Procurement Process and Information Management.

Previously, he worked at the international IT service provider Electronic Data Systems Corporation (EDS) where he held several senior management positions and served as Executive Director Digital Supply Chain in the United States.

Dr. Hofmann graduated Harvard Business School AMP, has a PhD in engineering from the ETH Zurich and a degree in business computer science and business administration from the University of Mannheim.

Dr. Florian Neukart

Dr. Florian Neukart is Principal Data Scientist at Volkswagen Group of America. He is lecturer and scientist in the fields of quantum computers and artificial intelligence at University of Leiden. Furthermore, he is the author of “Reverse Engineering the Mind – Consciously Acting Machines and Accelerated Evolution”.

Prof. Dr. Thomas Bäck

Prof. Dr. Thomas Bäck is scientist for global optimization, predictive analytics and Industry 4.0. As an entrepreneur, he has more than 20 years experience in industrial applications with companies such as 3M, Air Liquide, BMW, Daimler, Unilever and Volkswagen. Furthermore, he is an author of more than 300 scientific publications, e.g. “Evolutionary Algorithms in Theory and Practice” and co-inventor of 4 patents.

Sources

[1] D. Silver et. al.: Mastering the Game of Go with Deep Neural Networks and Tree Search, Nature 529, 484-489 (January 28, 2016).

[2] https://en.wikipedia.org/wiki/Cross_Industry_Standard_Process_for_Data_Mining

[3] Systems “in which information and software components are connected to mechanical and electronic components and in which data is transferred and exchanged, and monitoring and control tasks are carried out, in real-time using infrastructures such as the Internet.” (Translation of the following article in Gabler Wirtschaftslexikon, Springer: http://wirtschaftslexikon.gabler.de/Definition/cyber-physische-systeme.html).

[4] Industry 4.0 is defined therein as “a marketing term that is also used in science communication and refers to a ‘future project’ of the German federal government. The so-called ‘Fourth Industrial Revolution’ is characterized by the customization and hybridization of products and the integration of customers and business partners into business processes.” (Translation of the following article in Gabler Wirtschaftslexikon, Springer: http://wirtschaftslexikon.gabler.de/Definition/industrie-4-0.html).

[5] E. Rich, K. Knight: Artificial Intelligence, 5, 1990

[6] Th. Bäck, D.B. Fogel, Z. Michalewicz: Handbook of Evolutionary Computation, Institute of Physics Publishing, New York, 1997.

[7] R. Bajcsy: Active perception, Proceedings of the IEEE, 76:996-1005, 1988

[8] J. L. Crowley, H. I. Christensen: Vision as a Process: Basic Research on Computer Vision Systems, Berlin: Springer, 1995

[9] D. P. Huttenlocher, S. Ulman: Recognizing Solid Objects by Alignment with an Image, International Journal of Computer Vision, 5: 195-212, 1990

[10] K. Frankish, W. M. Ramsey: The Cambridge Handbook of Artificial Intelligence, Cambridge: Cambridge University Press, 2014

[11] F. Chaumette, S. Hutchinson: Visual Servo Control I: Basic Approaches, IEEE Robotics and Automation Magazine, 13(4): 82-90, 2006

[12] E. D. Dickmanns: Dynamic Vision for Perception and Control of Motion, London: Springer, 2007

[13] T. M. Straat, M. A. Fischler: Context-Based Vision: Recognizing Objects Using Information from Both 2D and 3D Imagery, IEEE Transactions on Pattern Analysis and Machine Intelligence, 13: 1050-65, 1991

[14] D. Hoiem, A. A. Efros, M. Hebert: Putting Objects in Perspective, Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2137-44, 2006

[15] H. Buxton: Learning and Understanding Dynamic Scene Activity: A Review, Vision Computing, 21: 125-36, 2003

[16] N. Lavarac, S. Dzeroski: Inductive Logic Programming, Vol. 3: Non-Monotonic Reasoning and Uncertain Reasoning, Oxford University Press: Oxford, 1994

[17] K. Frankish, W. M. Ramsey: The Cambridge Handbook of Artificial Intelligence, Cambridge: Cambridge University Press, 2014

[18] G. Leech, R. Garside, M. Bryant: . CLAWS4: The Tagging of the British National Corpus. In Proceedings of the 15th International Conference on Computational Linguistics (COLING 94) Kyoto, Japan, pp. 622-628, 1994

[19] K. Spärck Jones: Information Retrieval and Artificial Intelligence, Artificial Intelligence 141: 257-81, 1999

[20] A. Newell, H. A. Simon: Computer Science as Empirical Enquiry: Symbols and Search, Communications of the ACM 19:113-26

[21] M. Bratman, D. J. Israel, M. E. Pollack: Plans and Resource-Bounded Practical Reasoning, Computational Intelligence, 4: 156-72, 1988

[22] H. Ah. Bond, L. Gasser: Readings in Distributed Artificial Intelligence, San Mateo, CA: Morgan Kaufmann, 1988

[23] E. H. Durfee: Coordination for Distributed Problem Solvers, Boston, MA: Kluwer Academic, 1988

[24] G. Weiss: Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence, Cambridge, MA: MIT Press, 1999

[25] K. Frankish, W. M. Ramsey: The Cambridge Handbook of Artificial Intelligence, Cambridge: Cambridge University Press, 2014

[26] E. Alonso: Multi-Agent Learning, Special Issue of Autonomous Agents and Multi-Agent Systems 15(1), 2007

[27] E. Alonso: M. d’Inverno, D. Kudenko, M. Luck, J. Noble: Learning in Multi-Agent Systems, Knowledge Engineering Review 16: 277-84, 2001

[28] M. Veloso, P. Stone: Multiagent Systems: A Survey from a Machine Learning Perspective, Autonomous Robots 8: 345-83, 2000

[29] R. Vohra, M. Wellmann: Foundations of Multi-Agent Learning, Artificial Intelligence 171:363-4, 2007

[30] L. Busoniu, R. Babuska, B. De Schutter: A Comprehensive Survey of Multi-Agent Reinforcement Learning, IEEE Transactions on Systems, Man, and Cybernetics – Part C: Applications and Reviews 38: 156-72, 2008

[31] “Evolution strategies” are a variant of “evolutionary algorithms,” which has been developed in Germany. They offer significantly better performance than genetic algorithms for this type of task. See also Th. Bäck: Evolutionary Algorithms in Theory and Practice, Oxford University Press, NY, 1996.

[32] For example: Th. Bäck, C. Foussette, P. Krause: Automatische Metamodellierung von CAE-Simulationsmodellen (Automatic Meta-Modeling of CAE Simulation Models), ATZ – Automobiltechnische Zeitschrift 117(5), 64-69, 2015.

[33] For details: http://www.divis-gmbh.de/fileadmin/download/fallbeispiele/140311_Fallbeispiel_BMW_Machbarkeitsbewertung_Umformsimulation_DE.pdf

[34] This term is used in a great many ways and is actually very general in nature. We use it here in the sense of continuous monitoring of company KPIs that are automatically generated and analyzed on a weekly basis, for example.

[35] http://www.automotive-fleet.com/news/story/2003/02/gm-installs-nutechs-vehicle-distribution-system.aspx

[36] C. Sorg: Data Mining als Methode zur Industrialisierung und Qualifizierung neuer Fertigungsprozesse für CFK-Bauteile in automobiler Großserienproduktion (Data Mining as a Method for the Industrialization and Qualification of New Production Processes for CFRP Components in Large-Scale Automotive Production). Dissertation, Technical University of Munich, 2014. Dr. Hut Verlag.

[37] One example can be found in this article: http://www.enbis.org/activities/events/current/214_ENBIS_12_in_Ljubljana/programmeitem/1183_Ask_the_Right_Questions__or_Apply_Involved_Statistics__Thoughts_on_the_Analysis_of_Customer_Satisfaction_Data.

[38] http://www.syntragy.com/doc/q3-05%5B1%5D.pdf

[39] L. Gräning, B. Sendhoff: Shape Mining: A Holistic Data Mining Approach to Engineering Design. Advanced Engineering Informatics 28(2), 166-185, 2014.

[40] F. Neukart: Reverse-Engineering the Mind; see https://www.scribd.com/mobile/doc/195264056/Reverse-Engineering-the-Mind, 2014

[41] F. Neukart: On Quantum Computers and Artificial Neural Networks, Signal Processing Research 2(1), 2013

Posts

1, A brief review on back propagation of DCL.

2, Forward propagation of simple RNN

3, The steps of back propagation of simple RNN

1. Predictive Analytics

2. Improved Cybersecurity

3. Digital Workers

4. Hybrid Workforce

5. Process Intelligence

Conclusion

Premier organizations prefer Hadoop Developers with Java skills

How Can AI Help?

Repetitive, Menial Tasks

Reducing Misdiagnosis

Artificial Intelligence Assistants

Conclusion

Table of Contents

1 Introduction

2 The data mining process

3 The pillars of artificial intelligence

3.1 Machine learning

3.2 Computer vision

3.3 Inference and decision-making

3.4 Language and communication

3.5 Agents and actions

4 Data mining and artificial intelligence in the automotive industry

4.1 Development

4.2 Procurement

4.3 Logistics

4.4 Production

4.5 Marketing

4.6 Sales, after-sales, and retail

4.7 Connected customer

5 Vision

5.1 Vision – Vehicles as autonomous, adaptive, and social agents & cities as super-agents

5.2 Vision – integrated factory optimization

5.3 Vision – companies acting autonomously

6 Conclusions

Authors

Sources

Interesting links

Pages

Categories

Archive