Introduction to Machine Learning


Computer Science has been enlisted in the service of Intelligence since its inception and is a direct derivative of warfare in World War II. Video games, as a direct product of military simulations, serve as a secondary market for technology developed initially for military purposes. Building on the ideas of Alan Turing, Konrad Zuse, Heinz Billing and others during WWII, Artificial Intelligence (AI) has developed over the past several decades toward more autonomous computation that does not involve direct human manipulation of algorithms, informing military simulations of combat, actual targeting in combat and, in the public market, video games.

An algorithm is a finite sequence of steps for solving a problem; a computer program, or a library of code, implements one or more algorithms. The basic building block of Artificial Intelligence is statistics, in particular what is known as Statistical Learning. We have also seen how commercial advances in AI that are considered cutting edge, such as Generative Adversarial Networks (GANs), had been used in military defense technology for some 20 years before being ‘leaked’ to the commercial world. In the commercial application of AI, severe ethical and technological problems are constantly at the forefront of the discussion. In the covert world, with limited dialogue and limited access to others’ research, there is no guarantee that such cutting-edge developments were made with due regard to code review and testing before being put into production to meet the deadlines of lucrative contracts in the often black-budgeted world of defense contractors.

Because this is a technical question, it is good to have at least some technical background on the challenges involved in developing Automation and AI. What follows is an overview of Machine Learning.

First, we need to understand what we mean when we talk about Machine Learning:

“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.” (Tom Mitchell)
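To make this definition concrete, here is a minimal sketch in plain Python. The task, the data and the function names are illustrative assumptions and are not part of Mitchell's text: the task T is labelling a number 1 if it is at least 5 and 0 otherwise, the experience E is a growing list of labelled examples, and the performance measure P is accuracy on a fixed test set. The learner is a simple nearest-neighbour rule.

```python
# Illustrative sketch of Mitchell's E / T / P (all names and data are assumptions).

def predict(examples, x):
    """Label x with the label of the closest example seen so far (1-nearest neighbour)."""
    _, nearest_label = min(examples, key=lambda ex: abs(ex[0] - x))
    return nearest_label

def accuracy(examples, test_set):
    """Performance measure P: fraction of test points labelled correctly."""
    correct = sum(predict(examples, x) == label for x, label in test_set)
    return correct / len(test_set)

# Task T: label a number 1 if it is >= 5, else 0.
test_set = [(x, int(x >= 5)) for x in range(10)]

# Experience E: labelled examples, revealed to the learner one at a time.
experience = [(2, 0), (9, 1), (5, 1), (4, 0)]

# Accuracy after the learner has seen the first 1, 2, 3 and 4 examples.
scores = [accuracy(experience[:n], test_set) for n in range(1, len(experience) + 1)]
print(scores)  # [0.5, 0.9, 0.9, 1.0]
```

With this hand-picked data, performance at T, as measured by P, never decreases as E grows, which is the sense in which the program "learns". With other data or another learner the curve need not rise at every step.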

Machine Learning is based in statistics (much as statistical mechanics is in physics), and the early work in Artificial Intelligence was done in Statistical Learning.

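Statistical Learning: a set of approaches for estimating an unknown function f that relates inputs X to an output Y, where Y = f(X) + error. We estimate f for two purposes, prediction and inference, and there are two broad ways to estimate it: parametric methods, which assume a functional form for f and fit its parameters to the data, and non-parametric methods, which make no explicit assumption about the form of f.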
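The difference between parametric and non-parametric estimation can be sketched in a few lines of plain Python. This is an illustration under assumptions of my own: the toy data, the function names, and the choice of a least-squares line and a k-nearest-neighbour average as representatives of each family.

```python
# Toy data (made up for illustration): y = 2x + 1 with no noise.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]

def fit_line(xs, ys):
    """Parametric: assume f(x) = intercept + slope * x and fit both by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def knn_predict(xs, ys, x_new, k=2):
    """Non-parametric: no assumed form for f; average the k nearest observed outputs."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x_new))[:k]
    return sum(y for _, y in nearest) / k

intercept, slope = fit_line(xs, ys)
x_new = 2.4
parametric_estimate = intercept + slope * x_new
nonparametric_estimate = knn_predict(xs, ys, x_new)

print("fitted parameters:", intercept, slope)
print("parametric estimate of f(2.4):", parametric_estimate)
print("non-parametric estimate of f(2.4):", nonparametric_estimate)
```

The parametric fit summarises the data in two numbers and can then discard it, while the nearest-neighbour estimate has to keep the data around and consult it for every prediction.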

Statistical learning is a field of study that deals with the development and application of statistical models to understand data and to make predictions or decisions from it. It encompasses a wide range of techniques, including supervised learning (such as regression and classification), unsupervised learning (such as clustering and dimensionality reduction), and reinforcement learning. These methods are used for prediction, inference, and decision making, and they are often combined to solve complex problems. Popular statistical learning methods include linear regression, decision trees, and neural networks. Machine learning is a field of study that uses statistical techniques to give computer systems the ability to “learn” from data, that is, to progressively improve performance on a specific task through experience rather than by being explicitly programmed. It is a type of artificial intelligence.

There are several programming languages commonly used in the field of machine learning, including:

Python: the most widely used programming language in machine learning. It has a large number of libraries and frameworks, such as TensorFlow, PyTorch, and scikit-learn, which make it easier to build and train machine learning models.

R: a statistical programming language with a large number of packages for data analysis and machine learning. It is widely used for developing statistical models and is popular among data scientists.

Java: an object-oriented programming language that is well suited to developing large-scale applications. It has libraries such as Weka and Deeplearning4j for building machine learning models.

Julia: a relatively new programming language that is gaining popularity in machine learning. It is designed to be fast and easy to use, and it has a growing number of packages for machine learning, including Flux.jl and MLJ.jl.

C++: a compiled language that gives low-level control over memory and hardware and is often used for developing high-performance machine learning algorithms. Caffe is written in C++, and both TensorFlow and PyTorch have C++ cores and expose C++ APIs (PyTorch’s is called LibTorch).

In this book I focus on Python for Machine Learning examples and C# for game development. A GitHub repo of the code is at https://github.com/autonomous019/play_ai (download code as needed). In this section we will be focusing on Python. In Python, you can use libraries such as scikit-learn for traditional machine learning tasks like regression, classification, and clustering. For deep learning, TensorFlow and PyTorch are popular choices. These libraries provide a high-level API for building and training neural networks, and they handle the low-level implementation details for you. An important part of Python development is the use of pip to install modules and the dependencies needed for different machine learning tasks. For instance, to install matplotlib so that it can be imported into your notebooks, you would run one of the following, either on the command line or in the notebook itself:

pip install matplotlib   # run on the command line

!pip install matplotlib  # run in a notebook cell
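After installing, it can be useful to confirm from Python that a package is importable in the environment your notebook is actually using. The following is a minimal sketch using only the standard library. The helper name is_installed is my own, and the check is meant for top-level module names such as matplotlib.

```python
import importlib.util

def is_installed(module_name):
    """Return True if the top-level module can be found in the current environment."""
    return importlib.util.find_spec(module_name) is not None

# "json" ships with Python; "matplotlib" is only found if it has been installed.
for name in ["json", "matplotlib"]:
    status = "available" if is_installed(name) else "missing (install it with pip)"
    print(f"{name}: {status}")
```

If a package shows as missing right after a successful install, the install most likely went into a different Python environment than the one running the notebook.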

Machine Learning in Python is usually developed in notebooks. Recently, Google Colaboratory, or Colab (https://colab.research.google.com/), has become a popular place to develop using notebooks, although you can also install Anaconda and run Jupyter Notebooks (https://jupyter.org/) on your own machine. Another online resource for learning and coding Python for Machine Learning is Kaggle.com, where you can browse open-source ML notebooks and run them on your own. Jupyter notebooks are a popular tool in the Python programming community, especially in data science and machine learning. A Jupyter notebook is an interactive document that lets you mix code, text, and visualizations in one place. It is an excellent tool for experimenting with code, testing ideas, and documenting your work.

In a Jupyter notebook, you can write and run code snippets, add Markdown cells for documentation, and display visualizations and outputs. The notebook shows the results of your code immediately, which makes the code easier to understand and debug. Jupyter notebooks are also easy to share, because they can be exported to other formats such as HTML, PDF, and LaTeX.

To use a Jupyter notebook, you need to have Python installed on your computer, along with the Jupyter package. Once you have installed Jupyter, you can start it by running the following command in your terminal:

jupyter notebook

This will launch a web-based interface where you can create and manage notebooks.

You write code in cells and run a cell by pressing Shift + Enter or by clicking the Run button in the toolbar. Whether you are a beginner or an experienced Python programmer, this makes notebooks a valuable tool for interactive programming and exploration.

Jupyter notebooks are included in Anaconda, a distribution of Python aimed at data science. Anaconda ships with Python and hundreds of open-source packages that are pre-installed and ready to use, making it easy to get started with your project. Let’s go over the steps to install Anaconda on your computer.

Setting up a Development Environment for Machine Learning

Step 1: Download the Anaconda Installer

The first step to installing Anaconda is to download the installer. You can do this by visiting the Anaconda website (https://www.anaconda.com/products/distribution) and clicking on the “Download” button. Make sure to select the version of Anaconda that is compatible with your operating system.

Step 2: Install Anaconda

Once you have downloaded the installer, you can start the installation process. On Windows, double-click the downloaded .exe file and follow the on-screen instructions. On macOS and Linux, open the terminal and run the following command:

bash Anaconda-latest-MacOSX-x86_64.sh

Replace Anaconda-latest-MacOSX-x86_64.sh with the name of the Anaconda installer you downloaded. Follow the on-screen instructions to complete the installation.

Step 3: Verify the Installation

To verify that Anaconda is installed, you can open a terminal or command prompt and run the following command:

conda --version

This should display the version number of conda, the package and environment manager that ships with Anaconda.

Step 4: Create a Virtual Environment

Anaconda allows you to create virtual environments for your projects, which are isolated environments that have their own set of packages and dependencies. To create a virtual environment, run the following command:

conda create --name myenv

Replace myenv with the name of your virtual environment. This will create a new virtual environment with the name you specified.

Step 5: Activate the Virtual Environment

To activate the virtual environment you created, run the following command:

conda activate myenv

Replace myenv with the name of your virtual environment. This will change the active environment to the one you specified, and you will see the name of the virtual environment in the terminal prompt.
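Besides looking at the prompt, you can confirm from inside Python which interpreter and environment are in use. The sketch below uses only the standard library. The function name describe_environment is my own, and it relies on the CONDA_DEFAULT_ENV environment variable that conda sets on activation, so that entry is None when no conda environment is active.

```python
import os
import sys

def describe_environment():
    """Collect basic facts about the Python environment running this code."""
    return {
        "python_executable": sys.executable,       # path of the running interpreter
        "python_version": sys.version.split()[0],  # e.g. "3.11.4"
        "prefix": sys.prefix,                      # root directory of the environment
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),  # None outside conda
    }

info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

Running this in a notebook cell is a quick way to check that the notebook kernel uses the environment you activated rather than a system-wide Python.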

Step 6: Install Packages

Now that you have Anaconda installed and a virtual environment set up, you can start installing packages that you need for your project. To install a package, run the following command:

conda install package_name

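Replace package_name with the name of the package you want to install; conda installs it into the currently active environment. The packages that ship with Anaconda are pre-installed in its base environment, whereas a newly created environment contains only what you install into it.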

Once you have installed Anaconda and are running Jupyter notebooks, or are using Google Colab, you are ready to start importing libraries such as scikit-learn, pandas and Matplotlib. To work with large sets of data, programmers use DataFrames, table-like structures that hold data and make it easy to manipulate. Pandas is ideal for this purpose.
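As a first taste of working with a DataFrame, here is a minimal sketch. It assumes pandas is installed in the active environment, and the column names and values are hypothetical data made up for illustration.

```python
import pandas as pd

# Hypothetical data, made up for illustration.
df = pd.DataFrame({
    "player": ["ana", "ben", "cy", "dee"],
    "level": [3, 7, 5, 7],
    "score": [120.0, 310.0, 205.0, 290.0],
})

# Select the rows where a condition holds.
high_level = df[df["level"] >= 5]

# Summarise a single column.
mean_score = df["score"].mean()

# Group rows by one column and average another within each group.
by_level = df.groupby("level")["score"].mean()

print(df)
print(high_level)
print(mean_score)
print(by_level)
```

Each column of a DataFrame is a labelled series of values, so filtering, summarising and grouping are short expressions rather than explicit loops.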
