A new research paper from the University of Massachusetts, Amherst, looked at the carbon dioxide (CO2) generated while training several common large artificial intelligence (AI) models and found that the process can emit nearly five times as much as an average American car does over its lifetime, including the manufacture of the car itself.

The paper specifically examined model training for natural-language processing (NLP), the branch of AI that handles natural-language interactions. The study found that, in the most expensive case, training generated more than 626,000 pounds of carbon dioxide.

This is significant, since AI training is one IT process that has remained firmly on-premises and not moved to the cloud. Very expensive equipment is needed, as are large volumes of data, so the cloud isn’t right for most AI training, and the report notes this. Plus, IT shops want to keep that kind of IP in-house. So, if you are experimenting with AI, that power bill is going to go up.

While the report used carbon dioxide as a measure, that’s still the product of electricity generation. Training involves the use of the most powerful processors, typically Nvidia GPUs, and they are not known for being low-power draws. And as the paper notes, “Model training also incurs a substantial cost to the environment due to the energy required to power this hardware for weeks or months at a time.”

Training is the most processor-intensive portion of AI. That means power-hungry Nvidia GPUs running at full utilization for the entire time. In this case, the models are learning to handle and process natural-language questions rather than the broken strings of keywords in a typical Google search.

The report said training one model with neural architecture search generated 626,155 pounds of CO2.
By contrast, one passenger flying round trip between New York and San Francisco would generate 1,984 pounds of CO2, an average person would generate 11,023 pounds in one year, and a car would generate 126,000 pounds over the course of its lifetime.

How the researchers calculated the CO2 amounts

The researchers examined four models in the NLP field that have been responsible for the biggest leaps in performance: Transformer, ELMo, BERT, and GPT-2. They trained all of the models on a single Nvidia Titan X GPU, with the exception of ELMo, which was trained on three Nvidia GTX 1080 Ti GPUs. Each model was trained for a maximum of one day.

They then used the number of training hours listed in each model’s original paper to calculate the total energy consumed over the complete training process. That number was converted into pounds of carbon-dioxide equivalent based on the average energy mix in the U.S.

The big takeaway is that computational costs start out relatively low, but they mushroom when additional tuning steps are used to increase the model’s final accuracy. A tuning process known as neural architecture search (NAS) is the worst offender because it does so much processing. NAS is an algorithm that searches for the best neural network architecture. It is seriously advanced AI and requires the most processing time and power.

The researchers suggest it would be beneficial to directly compare different models to perform a cost-benefit (accuracy) analysis.

“To address this, when proposing a model that is meant to be re-trained for downstream use, such as re-training on a new domain or fine-tuning on a new task, authors should report training time and computational resources required, as well as model sensitivity to hyperparameters.
This will enable direct comparison across models, allowing subsequent consumers of these models to accurately assess whether the required computational resources are compatible with their setting,” the authors wrote.

They also say researchers who are cost-constrained should pool resources and avoid the cloud, as cloud compute time is more expensive. In an example, they said a GPU server with eight Nvidia 1080 Ti GPUs and supporting hardware is available for approximately $20,000. At that price, the hardware needed to develop the sample models used in their study would cost $145,000 plus electricity, about half the estimated cost of using on-demand cloud GPUs.

“Unlike money spent on cloud compute, however, that invested in centralized resources would continue to pay off as resources are shared across many projects. A government-funded academic compute cloud would provide equitable access to all researchers,” they wrote.
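
For readers who want to check the comparisons quoted earlier, the arithmetic can be sketched in a few lines of Python. This is only an illustration, not the researchers' code: the constants are the figures as reported in this article, the function names are made up for the example, and the emission factor passed to co2e_pounds is a hypothetical input that the article does not state.

```python
# Figures as quoted in the article, in pounds of CO2.
NAS_TRAINING_LBS = 626_155       # one model trained with neural architecture search
CAR_LIFETIME_LBS = 126_000       # average car over its lifetime
ROUND_TRIP_FLIGHT_LBS = 1_984    # one passenger, New York <-> San Francisco


def times_larger(amount_lbs: float, baseline_lbs: float) -> float:
    """Return how many times larger amount_lbs is than baseline_lbs."""
    if baseline_lbs <= 0:
        raise ValueError("baseline_lbs must be positive")
    return amount_lbs / baseline_lbs


def co2e_pounds(avg_power_watts: float, hours: float, lbs_per_kwh: float) -> float:
    """Convert an average power draw sustained for some hours into pounds of CO2e.

    lbs_per_kwh is an assumed emission factor for the energy mix; it is a
    hypothetical parameter here because the article does not give a value.
    """
    energy_kwh = avg_power_watts * hours / 1000.0  # watts * hours -> kilowatt-hours
    return energy_kwh * lbs_per_kwh


if __name__ == "__main__":
    cars = times_larger(NAS_TRAINING_LBS, CAR_LIFETIME_LBS)
    flights = times_larger(NAS_TRAINING_LBS, ROUND_TRIP_FLIGHT_LBS)
    print(f"Car lifetimes: {cars:.2f}")
    print(f"Round-trip flights: {flights:.1f}")
```

The first ratio comes out just under five, which matches the "nearly five times" claim. The conversion function mirrors the two steps the article describes (total energy from training hours, then energy to CO2 equivalent), but any real estimate depends on the power draw and emission factor supplied.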