PHD Discussions

Ask, Learn and Accelerate in your PhD Research

1 year ago in Machine Learning By Pradeep Kumar

What is a suitable ML algorithm for a dataset with 100,000 records and 10 features?

What type of machine learning algorithm would you recommend for a tabular dataset with 100,000 rows and 10 columns to ensure robust performance?

All Answers (1 answer in all)

By Sumitra R Answered 8 months ago

For a tabular dataset with 100,000 records and 10 features, efficient and powerful tree ensembles such as Gradient Boosted Trees (XGBoost, LightGBM, CatBoost) or Random Forest are excellent choices. If the relationship between the features and the target is approximately linear, regularized linear models (Logistic or Linear Regression with an L1 or L2 penalty) are also suitable. Either Python (scikit-learn, XGBoost) or R works well as the programming platform.
