Ask, Learn and Accelerate in your PhD Research

2 years ago in Machine Learning By Pradeep Kumar

What is a suitable ML algorithm for a dataset with 100,000 records and 10 features?

What type of machine learning algorithm would you recommend for a tabular dataset with 100,000 rows and 10 columns to ensure robust performance?

All Answers (1 answer in all)

By Sumitra R Answered 10 months ago

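For a tabular dataset of about 100k records and 10 features, gradient-boosted trees (XGBoost, LightGBM, CatBoost) or a Random Forest are excellent choices: they are efficient at this scale and capture non-linear relationships between features. If the relationship between the features and the target is roughly linear, regularized linear models (logistic or linear regression with an L1 or L2 penalty) are also suitable. Either Python (scikit-learn, XGBoost) or R works well as the programming platform.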
