Semiconductor AI/ML technologies

Advantages of decision-tree models

Research shows that tree-based models outperform deep learning on tabular data.

by SmartFactory Automation Solution Experts Team

In our recent customer use case, “Achieve accurate lot cycle time predictions for more on-time deliveries,” we noted that the most appropriate ML model for the customer’s needs was a gradient-boosted, tree-based model, specifically the Light Gradient Boosting Machine (LightGBM) implementation. This choice is supported by recent research by Léo Grinsztajn, Edouard Oyallon, and Gaël Varoquaux of Inria Saclay Centre and Sorbonne University, who concluded that tree-based models outperform deep learning on medium-sized tabular data.

Noting that deep learning has “enabled tremendous progress on text and image datasets,”¹ the researchers stated that it had not been proven superior on tabular data. To compare the two approaches, they collected 45 tabular datasets, each comprising more than 3,000 real-world examples. They then trained standard and novel deep learning methods, including a vanilla neural network, ResNet, and two Transformer-based models, as well as tree-based models including XGBoost, gradient boosting machines, and Random Forests. Each model was trained 400 times, with hyperparameters drawn at random from a predefined search space.
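The random hyperparameter search described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' benchmark code: the parameter names, their ranges, and the `toy_evaluate` function (a stand-in for "train the model and return a validation score") are all assumptions made for the example.

```python
import random

# Hypothetical search space for a gradient-boosted tree model. The parameter
# names and ranges are illustrative assumptions, not the paper's settings.
SEARCH_SPACE = {
    "learning_rate": lambda rng: 10 ** rng.uniform(-3, 0),  # log-uniform in [0.001, 1]
    "max_depth": lambda rng: rng.randint(2, 12),            # inclusive integer range
    "n_estimators": lambda rng: rng.choice([100, 200, 500, 1000]),
}


def sample_config(rng):
    """Draw one random configuration from the predefined search space."""
    return {name: draw(rng) for name, draw in SEARCH_SPACE.items()}


def random_search(evaluate, n_trials, seed=0):
    """Evaluate n_trials random configurations and return the best one."""
    rng = random.Random(seed)
    history = []
    for _ in range(n_trials):
        config = sample_config(rng)
        history.append((evaluate(config), config))
    best_score, best_config = max(history, key=lambda item: item[0])
    return best_score, best_config, history


def toy_evaluate(config):
    """Stand-in for training a model and returning a validation score.

    Hypothetical: the score peaks at max_depth=6 and learning_rate=0.1.
    """
    return -abs(config["max_depth"] - 6) - abs(config["learning_rate"] - 0.1)


best_score, best_config, history = random_search(toy_evaluate, n_trials=400)
print(best_config, round(best_score, 4))
```

In a real comparison, `toy_evaluate` would be replaced by a function that trains the model with the sampled configuration and returns its score on held-out data. Fixing the seed keeps the search reproducible.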

When averaged across all tasks, the best tree-based models performed 20 to 30 percent better than the best deep learning models. The authors also found neural networks to be much more susceptible than tree-based models to uninformative features, that is, features that are random or carry little information about the target. When the authors removed uninformative features, the performance gap between the two model families narrowed. When they added random features to the datasets, the neural networks’ performance declined sharply.
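To make the idea of adding random features concrete, the toy sketch below appends Gaussian noise columns to a small synthetic dataset and fits a one-split decision tree (a "stump"). It does not reproduce the paper's experiment and includes no neural network; the data, the `add_random_features` helper, and the `fit_stump` function are hypothetical and exist only for illustration. It shows why a tree's split search can ignore noise columns: it scores each feature separately and keeps the one that best separates the labels.

```python
import random


def add_random_features(rows, n_extra, rng):
    """Return a copy of rows with n_extra Gaussian noise columns appended."""
    return [row + [rng.gauss(0.0, 1.0) for _ in range(n_extra)] for row in rows]


def fit_stump(rows, labels):
    """Fit a one-split decision stump.

    Tries every feature and every midpoint between consecutive sorted values,
    and returns (errors, feature_index, threshold) for the split with the
    fewest misclassified training rows.
    """
    best = None
    n_rows = len(rows)
    for j in range(len(rows[0])):
        values = sorted(set(row[j] for row in rows))
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            # Count errors when predicting 1 for values above the threshold.
            errors = sum((row[j] > threshold) != bool(y) for row, y in zip(rows, labels))
            errors = min(errors, n_rows - errors)  # allow the flipped orientation
            if best is None or errors < best[0]:
                best = (errors, j, threshold)
    return best


rng = random.Random(0)
# Toy data (an assumption for this example): one informative feature,
# and the label is 1 exactly when that feature is positive.
X = [[rng.uniform(-1.0, 1.0)] for _ in range(200)]
y = [int(row[0] > 0) for row in X]

# Append five uninformative noise columns, then fit the stump.
X_noisy = add_random_features(X, n_extra=5, rng=rng)
errors, feature, threshold = fit_stump(X_noisy, y)
print(f"chosen feature: {feature}, training errors: {errors}")
```

On this toy data the stump still splits on the original informative column (index 0) and ignores the five noise columns.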

The authors concluded, “Results show that tree-based models remain state-of-the-art on medium-sized data (∼10K samples) even without accounting for their superior speed.”

REFERENCE

1. Grinsztajn, L., Oyallon, E., Varoquaux, G. Why do tree-based models still outperform deep learning on tabular data? NeurIPS 2022 Datasets and Benchmarks Track, Nov 2022, New Orleans, United States. hal-03723551v2
https://hal.archives-ouvertes.fr/hal-03723551v2

About the Author

SmartFactory Automation Solution Experts Team

The Applied SmartFactory® automation solution experts team develops integrated automation solutions for semiconductor manufacturers to improve the performance of factories. From implementing a manufacturing execution system that advances collaboration and automation, to integrating AI/ML technologies for faster decision making, SmartFactory automation solutions enable manufacturers to prioritize quality and reliability across every stage of the manufacturing process.

