The Thinking Machine (Part II)

mariomahecha0098
May 27

Mario Mahecha, Santiago Guzman, Santiago Aristizabal

How artificial intelligence moved from rules to learning

In the first part of this series, we explored how humanity has repeatedly feared new thinking tools: writing, printing, calculators, and the internet. Each technology changed what humans needed to remember, calculate, or search. Artificial intelligence now raises a different question: what happens when a tool begins to participate in reasoning itself?

To understand AI today, we need to see it as a historical process. ChatGPT, image generators, and AI agents did not appear suddenly; they are the latest stage of a story that began in the 1950s.

1. Symbolic AI: Intelligence as Rules

In 1950, Alan Turing published “Computing Machinery and Intelligence,” asking whether machines could think. Instead of defining thinking directly, he proposed what later became known as the Turing Test: could a machine imitate human conversation well enough to be mistaken for a person?

In 1956, the Dartmouth Summer Research Project on Artificial Intelligence formally launched AI as a field. The proposal for the workshop, written in 1955 by John McCarthy, Marvin Minsky, Nathaniel Rochester, and Claude Shannon, introduced the term “artificial intelligence.”

Early AI was mostly symbolic AI. Researchers believed intelligence could be represented through logic, symbols, and rules.

A machine could follow instructions such as:

If condition A is present, and condition B is present, then conclusion C is likely.

This worked for structured problems, but the real world was too messy. Human reasoning depends on ambiguity, uncertainty, and context. Not everything can be reduced to a clean rule.
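
To make the idea concrete, here is a minimal sketch of such a rule in Python. The function name conclude and the facts "A", "B", and "C" are placeholders chosen for illustration; they do not come from any historical system.

```python
from typing import Optional, Set

def conclude(facts: Set[str]) -> Optional[str]:
    """Apply one hand-written rule: A and B together imply C."""
    if "A" in facts and "B" in facts:
        return "C"
    return None  # no rule matched, so the system has no answer

print(conclude({"A", "B"}))         # C
print(conclude({"A"}))              # None
print(conclude({"A", "almost B"}))  # None: the rule cannot bend to a near match
```

The last call hints at the problem: anything the rule's author did not anticipate simply falls through.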

Key idea: Early AI tried to make machines intelligent by writing rules for them.

2. Expert Systems: Encoding Human Expertise

By the 1970s and 1980s, AI moved toward practical systems designed for specific expert tasks.

These were called expert systems.

Instead of trying to create a generally intelligent machine, researchers built programs that could imitate experts in narrow domains. One famous example was MYCIN, developed at Stanford in the 1970s, which used rule-based reasoning to help identify bacteria and recommend antibiotics.

Expert systems were important because they showed that AI could be useful in real-world decision-making.

But they also exposed a major limitation:

Human expertise is difficult to fully write down.

Experts often rely on experience, intuition, and pattern recognition. A rule-based system could apply knowledge, but it could not easily adapt when reality became more complex than the rules.
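
A rough sketch of how a rule-based system chains its knowledge might look like the following. It is not MYCIN's code: the rule names, the certainty numbers, and the forward_chain function are illustrative assumptions, and the sketch reasons forward from facts for brevity, whereas MYCIN worked backward from goals.

```python
from typing import Dict, List, Tuple

# Each rule: (premises that must all be known, conclusion, rule certainty in [0, 1]).
# Rules and numbers are illustrative placeholders, not clinical knowledge.
Rule = Tuple[Tuple[str, ...], str, float]

RULES: List[Rule] = [
    (("gram_negative", "rod_shaped", "anaerobic"), "suspect_organism_x", 0.6),
    (("suspect_organism_x",), "consider_drug_y", 0.8),
]

def forward_chain(facts: Dict[str, float], rules: List[Rule]) -> Dict[str, float]:
    """Fire rules repeatedly until no new conclusion can be added."""
    known = dict(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion, rule_cf in rules:
            if conclusion in known:
                continue
            if all(p in known for p in premises):
                # Certainty of a conclusion: weakest premise times rule strength.
                known[conclusion] = min(known[p] for p in premises) * rule_cf
                changed = True
    return known

covered = forward_chain(
    {"gram_negative": 1.0, "rod_shaped": 1.0, "anaerobic": 0.9}, RULES
)
uncovered = forward_chain({"gram_positive": 1.0}, RULES)

print(sorted(covered))    # includes the two derived conclusions
print(sorted(uncovered))  # only the input fact: no rule applies
```

The second case shows the limitation described above: when reality falls outside the written rules, the system adds nothing.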

Key idea: Expert systems tried to capture human expertise, but they showed that expertise is more than a list of rules.

3. Machine Learning: From Rules to Data

The next major shift was machine learning. Instead of programming every rule manually, researchers began training machines with examples.

This idea was already present early in AI history. In 1959, Arthur Samuel used his checkers-playing program to show that a computer could improve through experience, and his paper helped popularize the term “machine learning.” But machine learning became much more powerful later, when digital data and computing power expanded.

In symbolic AI, humans wrote the rules. In machine learning, machines learned patterns from data.

For example, instead of writing every rule to detect spam emails, developers could show a model thousands or millions of spam and non-spam emails. The model would learn the patterns itself.

This changed the meaning of AI:

Machines stopped being only programmed. They started being trained.
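
The spam example can be sketched in a few lines of Python. This is a minimal illustration using a naive Bayes word-count model, which is one common choice and not the only one. The four training messages and the function names train and classify are invented for the example; a real filter would learn from far more data.

```python
import math
from collections import Counter

# Tiny invented training set; "ham" is the usual label for non-spam.
TRAIN = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday at noon", "ham"),
]

def train(examples):
    """Count how often each word appears in each class."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, class_counts, vocab

def classify(text, model):
    """Pick the class with the higher log-probability (add-one smoothing)."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        score = math.log(class_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in text.split():
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = train(TRAIN)
print(classify("claim free money", model))  # spam
print(classify("agenda for lunch", model))  # ham
```

Nobody wrote a rule saying that "free" signals spam. The model inferred it from the examples, and it would infer something different from different examples.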

Key idea: Machine learning shifted AI from human-written rules to data-driven pattern recognition.

4. Deep Learning: Learning Complex Patterns

Traditional machine learning could learn patterns from data, but it often depended on humans to first define which features were important. For example, engineers might manually tell the system to analyze edges, shapes, textures, or specific measurements before the algorithm could make predictions. Deep learning changed this approach.

Instead of relying heavily on handcrafted features, deep learning uses neural networks organized in many layers that can automatically learn representations directly from raw data. A layer is a stage of processing in which the system transforms information into a more complex representation. Early layers may detect simple patterns such as edges or contrasts. Intermediate layers may recognize shapes or textures. Deeper layers may identify complex objects, faces, organs, tumors, or highly abstract patterns.
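
A toy sketch can show what "layers" means in practice. The network below has two layers and computes exclusive-or (XOR), a function that no single weighted sum can represent. The weights here are set by hand purely for illustration; in actual deep learning they would be learned from data, and real networks have far more layers and units.

```python
def relu(values):
    """Keep positive signals and zero out the rest."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One layer: each output is a weighted sum of all inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

# Hand-set weights for illustration; a trained network would learn these.
HIDDEN_W, HIDDEN_B = [[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0]
OUTPUT_W, OUTPUT_B = [[1.0, -2.0]], [0.0]

def forward(x):
    hidden = relu(dense(x, HIDDEN_W, HIDDEN_B))  # layer 1: intermediate features
    return dense(hidden, OUTPUT_W, OUTPUT_B)[0]  # layer 2: combine into an answer

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, forward(x))
```

The hidden layer turns the raw inputs into two intermediate signals ("at least one input is on" and "both inputs are on"), and the output layer combines them. Stacking many such transformations is what lets deep networks move from edges to shapes to objects.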

In 2012, AlexNet, developed by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, won the ImageNet competition by a wide margin, showing the power of deep neural networks for image recognition.

In 2015, Yann LeCun, Yoshua Bengio, and Geoffrey Hinton published a major review in Nature summarizing how deep learning had improved speech recognition, visual object recognition, and other fields.

Deep learning became possible because several forces came together:

massive digital datasets
stronger computing power
graphics processing units
improved neural network architectures

The internet created enormous amounts of data. Deep learning learned from that data.

Key idea: Deep learning allowed machines to learn complex patterns directly from large datasets.

5. Generative AI: Machines Begin Producing

For many years, AI mostly classified, detected, ranked, or predicted. Then came generative AI.

Generative AI produces new content: text, images, code, audio, summaries, explanations, and conversations.

Unlike earlier AI systems that mainly analyzed existing information, generative AI produces new content by drawing on statistical patterns it has learned from massive datasets.

A key milestone was the Transformer architecture, introduced in 2017 in the paper “Attention Is All You Need” by Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. The Transformer replaced step-by-step recurrence with attention mechanisms, which let a model weigh every word in a sequence against every other word and process the whole sequence in parallel. It became the foundation of many modern language models.
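
The core operation of the paper, scaled dot-product attention, is softmax(QK^T / sqrt(d_k))V. Below is a minimal pure-Python sketch of that formula. The three token vectors are invented numbers, and the sketch feeds them in directly as queries, keys, and values; a real Transformer first passes them through learned projections and runs several attention heads side by side.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs

# Three toy token vectors (invented); each token attends to all three.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed = attention(tokens, tokens, tokens)
for row in mixed:
    print([round(x, 3) for x in row])
```

Each output row is a blend of all the tokens, weighted by how similar they are to the token asking the question. Because every row can be computed independently, the whole sequence can be processed at once.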

In 2020, OpenAI introduced GPT-3, a 175-billion-parameter language model that showed strong few-shot learning capabilities: it could perform many language tasks from only a few examples or instructions written into its prompt, with no additional training.
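
In practice, "few-shot" means the examples travel inside the prompt itself. The sketch below only builds such a prompt as a string; it calls no model or API. The helper name build_few_shot_prompt, the translation task, and the example pairs are all hypothetical.

```python
# Hypothetical helper: assembles a prompt, does not contact any model.
def build_few_shot_prompt(examples, query):
    """Show the task through a few solved examples, then leave the last one open."""
    parts = ["Translate English to Spanish."]
    for source, target in examples:
        parts.append(f"English: {source}\nSpanish: {target}")
    parts.append(f"English: {query}\nSpanish:")
    return "\n\n".join(parts)

prompt = build_few_shot_prompt(
    [("cheese", "queso"), ("good morning", "buenos días")],
    "thank you",
)
print(prompt)
```

A language model given this text is expected to continue the pattern and fill in the final line, even though it was never specifically trained as a translator.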

In 2022, ChatGPT brought generative AI into everyday public use. For the first time, millions of people could interact directly with AI using ordinary language.

This changed the relationship between humans and machines.

The internet made information searchable. Generative AI made information conversational.

Key idea: Generative AI changed AI from a mostly analytic tool into a conversational system capable of producing language, images, code, audio, and explanations.

References:

Turing AM. Computing machinery and intelligence. Mind. 1950;59(236):433-460.

McCarthy J, Minsky ML, Rochester N, Shannon CE. A proposal for the Dartmouth Summer Research Project on Artificial Intelligence. 1955.

Buchanan BG, Shortliffe EH, eds. Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project. Addison-Wesley; 1984.

Samuel AL. Some studies in machine learning using the game of checkers. IBM Journal of Research and Development. 1959;3(3):210-229.

Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems. 2012;25.

LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521:436-444.

Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need. Advances in Neural Information Processing Systems. 2017;30.

Bommasani R, Hudson DA, Adeli E, et al. On the opportunities and risks of foundation models. Stanford Center for Research on Foundation Models; 2021.