Everyone who is new to Machine Learning, Data Science, Deep Learning, or Artificial Intelligence in general struggles to choose an environment setup that will get them going with their projects.
In this article, we are going to walk you through our setup. It is the setup we used at university, so it has strongly shaped how we work after graduation and in our professional lives.
Keep in mind that this is what works well for us; for you it might be something different, so take it as a suggestion.
The best environment setup for Machine Learning ever:
– Programming Languages
- Python: this is our number one choice, simply because it is the best language for Machine Learning, Deep Learning, and Data Science. It is easy to understand, has tons of libraries, and huge support from the community. We use version 3.6.9; it works fine with all the libraries we need, and it is compatible with Django 2.1 for our web application projects.
- R: this is our number one choice whenever we work on something that is pure statistics or time-series analysis. It has tons of libraries, it is easy to understand, and it helped us a lot when we were learning Statistics at university. The ability to visualize results (which is easy in this language) will help you understand how your data is related.
– IDEs
- PyCharm: this is our IDE for Python projects. It is dead easy to use and offers plenty of support through its interface if you don’t want to use the terminal. It will let you create virtual environments and install the needed libraries in no time. There are different editions of it, but you will be completely fine with the Community edition, since you can use Anaconda for most of the libraries.
- RStudio: this is our IDE for R projects. It is well built, easy to understand, easy to use, and makes it simple to install everything you need for your project. It is free to use.
– Python Libraries
- NumPy: this is one of the essential libraries for Python projects. It lets you manipulate high-dimensional arrays: arithmetic operations, type conversions, reshaping, etc. It will help you organize your data for your models.
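To give a feel for it, here is a tiny sketch of the kind of array manipulations we mean (the values are just illustrative):

```python
import numpy as np

# Build a small 2-D array and apply common manipulations.
a = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
doubled = a * 2                  # element-wise arithmetic
as_float = a.astype(np.float64)  # type conversion
total = a.sum(axis=0)            # column-wise sums -> [3, 5, 7]
print(doubled)
print(total)
```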
- Matplotlib: we use this library for visual representation of results. It can do tons of stuff with your data, like plotting functions, bar charts, etc. We use it a lot in Computer Vision when we need to plot a couple of images to make comparisons.
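For example, a minimal sketch that plots a function next to a bar chart and saves the figure (the filename and data are made up):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# One figure, two panels: a line plot and a bar chart.
x = np.linspace(0, 2 * np.pi, 100)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.plot(x, np.sin(x), label="sin(x)")
ax1.legend()
ax2.bar(["a", "b", "c"], [3, 1, 2])
fig.savefig("comparison.png")
```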
- Pandas: this library is very nice to use when you need to organize your data and make it easier to use and understand. It organizes the data in structures called data frames and those will help you a lot with a clear representation of your data as well as customization of it.
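As a quick sketch of what a data frame gives you (the model names and scores here are invented):

```python
import pandas as pd

# Organize raw records into a DataFrame and add a derived column.
df = pd.DataFrame({
    "model": ["svm", "forest", "knn"],
    "accuracy": [0.91, 0.94, 0.88],
})
df["rank"] = df["accuracy"].rank(ascending=False).astype(int)

# Pick the best-performing row.
best = df.sort_values("accuracy", ascending=False).iloc[0]
print(best["model"])  # -> forest
```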
- SciPy: this is an open-source Python library for scientific and technical computing. The main packages include modules for optimization, linear algebra, integration, FFT, and signal and image processing.
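A small sketch of two of those modules in action (the function and the linear system are toy examples):

```python
import numpy as np
from scipy import linalg, optimize

# Minimize a simple one-dimensional function.
result = optimize.minimize_scalar(lambda x: (x - 3) ** 2)  # minimum at x = 3

# Solve the linear system A @ x = b.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = linalg.solve(A, b)  # -> [2.0, 3.0]
```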
- Scikit-learn: this is an open-source Python library that features many classification, regression, and clustering algorithms, such as support vector machines (SVMs), random forests, gradient boosting, k-means, and DBSCAN. We use it when we create and train our Machine Learning models.
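For instance, training one of those classifiers on a synthetic dataset takes only a few lines (all the numbers below are arbitrary choices):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Generate a toy dataset, split it, train a random forest, and score it.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)
score = clf.score(X_test, y_test)  # accuracy on the held-out set
```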
- OpenCV: this is one of the best Computer Vision and Image Processing libraries out there. It is very powerful, and you will find it useful in most Computer Vision cases. You can do tons of stuff with images and videos. We have a couple of Computer Vision articles that you can read.
- TensorFlow + Keras: whenever we do something Deep Learning related, we use TensorFlow. Keras is a high-level API for TensorFlow that we like because it lets you create neural networks in the easiest and most understandable way possible. It has tons of building blocks, like layers, optimizers, and loss functions, that will help you create the best possible models. You can find our Deep Learning articles and read more.
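To show what we mean by understandable, here is a sketch of a small classifier in the Keras Sequential API (the layer sizes are arbitrary):

```python
from tensorflow import keras

# A tiny fully-connected network: 4 inputs, one hidden layer, 3 classes.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(3, activation="softmax"),
])
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```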
– R Packages
- datasets: this R package contains the datasets we use when we want to try out our models, or just to see how data is related, find confidence intervals, check distributions, etc.
- graphics: this is the base graphics package in R. It will help you visualize your data, similar to Matplotlib in Python.
- cluster: this package is useful for cluster analysis. We use Python for most projects requiring cluster analysis, so we rarely use this package.
- Matrix: this package contains the matrix classes and methods you are going to need when doing high-dimensional calculations in R.
- stats: this is the statistics package in R. You will use it a lot; it contains functions and classes for the different statistical properties of your data.
– Package management
- Anaconda: for us, there is no better package manager than this. It will help you install everything you need related to Python or R. In our experience, it helps avoid the problems and errors that come with installing packages directly into your system’s R or Python distribution. You can create different environments and use them across different projects. You can install packages from the command prompt or use the Anaconda Navigator for graphical installations.
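For example, creating a fresh environment from the command prompt looks roughly like this (the environment name and library list are just an illustration; pick your own):

```shell
# Create and activate an isolated environment with a pinned Python version.
conda create --name ml-env python=3.6.9
conda activate ml-env

# Install libraries into this environment only, not into the system Python.
conda install numpy pandas matplotlib scikit-learn
```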
– Hardware
- MacBook Pro 13″: Processor 2.3 GHz Quad-Core Intel Core i5; Memory 8 GB 2133 MHz LPDDR3; Graphics Intel Iris Plus Graphics 655 1536 MB
- Dell Inspiron 15 3000: Processor 1.99 GHz Intel i7-8550U; Memory 8 GB; Graphics Intel UHD Graphics 620 (integrated) and AMD Radeon 520
- Dell Inspiron 15 3000: Processor 2.4 GHz Intel i7-5500U; Memory 8 GB; Graphics Intel HD Graphics 5500 (integrated) and NVIDIA GeForce 840M
If you can afford them, these are our top 3 picks for GPUs, CPUs, and monitors:
- NVIDIA Titan RTX Graphics Card
- EVGA GeForce RTX 2080 Super XC Gaming, 08G-P4-3182-KR, 8GB GDDR6
- ASUS Dual NVIDIA GeForce RTX 2070 Mini OC Edition Gaming Graphics Card
- AMD Ryzen 5 2600X Processor with Wraith Spire Cooler
- AMD Ryzen 9 3900X 12-core
- Intel Core i9-9900K Desktop Processor 8 Cores up to 5.0 GHz Turbo
- LG 27UK650-W 27 Inch 4K
- Dell Ultrasharp U2718Q 27-Inch 4K IPS Monitor
- Dell UltraSharp U3415W 34-Inch Curved
If you still can’t afford any of these, don’t worry; as you can see, we don’t have them yet either. We mentioned them for anyone who wants to know which are the best, and they will be our choice for our next setup.
So, this is our setup. As you can see, you don’t need much to start with Machine Learning. If you still cannot buy a good PC or laptop, you can use other resources like Google Colab, which allows you to run your models on their platform for free for up to 12 hours per day (if you are not doing something complex, you are not going to need 12 hours) or 24 hours per day for $9.99 per month.
There are many different setups that will let you create and run models, but the most important thing is the will to start and to learn something new.
If you are new to Artificial Intelligence and Machine Learning, then our platform is the place to start. We work in a laconic way, as the name suggests.
We are trying to create knowledge through practical examples and projects using programming (mainly Python) and transfer that knowledge in an understandable format for everyone, for FREE, FOREVER.
Like with every post we do, we encourage you to continue learning, trying, and creating.