How to improve the Training Time of Neural Networks

Neural Networks are one of the most exciting technologies in machine learning today. They deliver great insights and, in tasks like image recognition, can even reach better accuracy than humans. This article looks at a problem that many big networks struggle with: Training Time.
Reading about Neural Networks you often hear that bigger is better, and fair enough: big networks like GoogLeNet or VGG16 bring astonishing results. But once you start training them, you quickly realize the huge amount of time they need. Without strong GPUs there is a good chance you will sit in front of your computer for days before you see any results.
To get a little more insight, a network was tested for its training time with regard to some important parameters. Starting with the implementation, there is always the question of “How Deep?” and “How Big?” the network should be. So, the test began with a comparison of the number of hidden layers:

Number of hidden layers   Training Time
1    0:01:21.922
5    0:03:27.812
10   0:07:17.266
20   0:12:55.532


As you can quickly see, the training time naturally increases with the number of layers. The growth is neither strictly linear nor exponential, but it is still considerable, and it should certainly be factored in when you plan longer training runs. The difference between three and a half minutes and seven minutes does not seem too big. But scale it up to two days versus four days and you can easily see the value of really thinking about how big a network has to be and where you can cut some time.
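To make this kind of comparison concrete, here is a minimal sketch of how such a depth test could be set up, assuming PyTorch and small synthetic data purely for illustration (it is not the exact setup behind the numbers above): build a fully connected network with a configurable number of hidden layers and time a few epochs of training.

import time
import torch
import torch.nn as nn

def build_mlp(n_hidden_layers, width=64, n_inputs=20, n_outputs=2):
    # Stack a configurable number of hidden layers of equal width.
    layers = [nn.Linear(n_inputs, width), nn.ReLU()]
    for _ in range(n_hidden_layers - 1):
        layers += [nn.Linear(width, width), nn.ReLU()]
    layers.append(nn.Linear(width, n_outputs))
    return nn.Sequential(*layers)

# Synthetic data stands in for the real training set.
X = torch.randn(10_000, 20)
y = torch.randint(0, 2, (10_000,))

for n_layers in (1, 5, 10, 20):
    model = build_mlp(n_layers)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()
    start = time.perf_counter()
    for epoch in range(5):                   # a few epochs are enough for a timing comparison
        for i in range(0, len(X), 200):      # batch size 200, as in the later comparison
            xb, yb = X[i:i + 200], y[i:i + 200]
            optimizer.zero_grad()
            loss = loss_fn(model(xb), yb)
            loss.backward()
            optimizer.step()
    print(f"{n_layers:2d} hidden layers: {time.perf_counter() - start:.1f} s")

The absolute numbers will differ from machine to machine; what matters is that the same loop makes the trend easy to measure for your own model.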
Coming to the second question of “How Big?” a network should be, a similar picture emerges:

Multiplier of neurons   Training Time
0.5   0:02:29.468
1     0:03:27.812
3     0:07:58.562
5     0:12:09.969
10    0:25:58.688
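The neuron multiplier in this comparison simply scales the width of every hidden layer. As a small sketch of why that gets expensive so quickly, with an assumed baseline width of 64 neurons and a five-layer network chosen only for illustration, you can count how the number of trainable parameters grows with the multiplier:

import torch.nn as nn

BASE_WIDTH = 64  # assumed baseline width of a hidden layer

def build_mlp(multiplier, n_hidden_layers=5, n_inputs=20, n_outputs=2):
    # Every hidden layer is scaled by the same multiplier.
    width = int(BASE_WIDTH * multiplier)
    layers = [nn.Linear(n_inputs, width), nn.ReLU()]
    for _ in range(n_hidden_layers - 1):
        layers += [nn.Linear(width, width), nn.ReLU()]
    layers.append(nn.Linear(width, n_outputs))
    return nn.Sequential(*layers)

for multiplier in (0.5, 1, 3, 5, 10):
    model = build_mlp(multiplier)
    n_params = sum(p.numel() for p in model.parameters())
    print(f"multiplier {multiplier:>4}: {n_params:,} trainable parameters")

Because most weights sit in the layer-to-layer connections, the parameter count grows roughly with the square of the multiplier, which is a big part of why wider layers cost so much extra training time.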


Again, it becomes clear what a difference the layer size makes. The training time keeps growing, and if you scale this up to huge networks the difference can quickly be in the range of days. Now imagine having to adjust the network and train it several times: you can forget about fast decision making at that point.

Another important parameter that hugely affects the training time is the batch size. The smaller the batches, the better a network typically learns, but how much longer is the training time?

Batch Size   Training Time
20     0:12:38.844
200    0:03:27.812
2000   0:02:13.359


Having a batch size of 20 instead of 200 more than triples the training time (from about three and a half minutes to over twelve and a half). Again, quick insights into your newly acquired data are far away. There are many aspects to keep in mind, so how do you solve the problem of growing training time?
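The reason is the number of weight updates: with a batch size of 20 the optimizer takes ten times as many steps per epoch as with a batch size of 200. A minimal sketch of how such a batch-size timing could be run, again with assumed synthetic data and a small example model rather than the original setup:

import time
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in data: 10,000 samples with 20 features each.
dataset = TensorDataset(torch.randn(10_000, 20), torch.randint(0, 2, (10_000,)))

for batch_size in (20, 200, 2000):
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()
    start = time.perf_counter()
    for epoch in range(5):
        for xb, yb in loader:          # smaller batches mean many more optimizer steps
            optimizer.zero_grad()
            loss = loss_fn(model(xb), yb)
            loss.backward()
            optimizer.step()
    print(f"batch size {batch_size:4d}: {time.perf_counter() - start:.1f} s")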
To get good insights there is sometimes no way around a big, comprehensive Neural Network. Still, it is worth spending a thought on how many layers and neurons are really needed. Maybe one layer less still brings similar results but cuts the training time.
But there is an even better way to cut the Training Time: Computational Power.
All of the above tests were done on a simple computer.

But the digital age is here, and with it the age of cloud computing. That is where we as the Data Intelligence Hub come into play. A local computer often has 4-8 cores and a limited amount of memory (RAM). Our cloud environment offers multiple high-power CPU cores and, in the future, also GPUs for effective training of big and deep Neural Networks. In the cloud environment of the DIH it is possible to access as many cores and as much memory as you want, on demand. So instead of your 8 cores and 32 GB of RAM you can now choose between tens or, if needed, even hundreds of cores and hundreds of gigabytes of RAM, for as much and as long as you want. And scaling your environment up is nowadays mostly just a matter of clicking a few buttons or moving a slider. In the end, this is what helps make your predictions feasible.
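Making use of that extra hardware from the training code itself is usually a very small change. A minimal sketch in PyTorch, with the thread count and device names as examples only: ask for more CPU threads and move the model and data onto a GPU whenever one is available.

import torch
import torch.nn as nn

# Use more CPU threads when the environment provides more cores.
torch.set_num_threads(16)

# Prefer a GPU when one is available (e.g. in a cloud workspace), otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"training on: {device}")

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2)).to(device)
batch = torch.randn(200, 20, device=device)  # the data must live on the same device as the model
output = model(batch)

Everything else in the training loop stays the same; only the model and the batches are placed on the faster hardware.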

So if you want to accelerate your data analytics insights, get your premium access to the wide computing power of the Data Intelligence Hub and start your Neural Network training now!