In Python machine learning, how do I batch process data?

In Python machine learning, batch processing lets you train models on datasets that are too large to process comfortably in memory. The idea is to split the data into smaller, manageable chunks (batches) and update the model incrementally, one batch at a time. In scikit-learn, this requires an estimator that implements `partial_fit`, such as `SGDClassifier`; calling plain `fit` in a loop would retrain from scratch on each batch and discard everything learned from the earlier ones.

```python
# Import necessary libraries
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import SGDClassifier

# Sample dataset generation
data = np.random.rand(1000, 10)                 # 1000 samples, 10 features
labels = np.random.randint(0, 2, size=(1000,))  # Binary labels

# Splitting the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(data, labels, test_size=0.2)

# Standardizing the features (fit on train only, then apply to test)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Batch processing with incremental learning.
# Note: LogisticRegression.fit() retrains from scratch on every call, so
# looping over batches with it would keep only the last batch. SGDClassifier
# with log loss is a logistic-regression model trained by stochastic gradient
# descent, and its partial_fit() updates the model one batch at a time.
batch_size = 32
model = SGDClassifier(loss="log_loss")
classes = np.unique(y_train)  # partial_fit needs the full class list up front

for i in range(0, len(X_train), batch_size):
    X_batch = X_train[i:i + batch_size]
    y_batch = y_train[i:i + batch_size]
    model.partial_fit(X_batch, y_batch, classes=classes)

# Model evaluation
accuracy = model.score(X_test, y_test)
print("Accuracy:", accuracy)
```
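In practice you usually make several passes (epochs) over the training set and reshuffle between passes, since SGD-style updates benefit from randomized batch order. Here is a minimal sketch of that pattern; the `iter_batches` helper and the epoch count of 5 are illustrative choices, not part of scikit-learn, and the sketch reuses `model`, `classes`, `X_train`, and `y_train` from the snippet above:

```python
import numpy as np

def iter_batches(X, y, batch_size, rng):
    """Yield (X_batch, y_batch) pairs in a freshly shuffled order."""
    indices = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        batch_idx = indices[start:start + batch_size]
        yield X[batch_idx], y[batch_idx]

# Several epochs over the data, reshuffling each time (5 is arbitrary here).
rng = np.random.default_rng(0)
for epoch in range(5):
    for X_batch, y_batch in iter_batches(X_train, y_train, 32, rng):
        model.partial_fit(X_batch, y_batch, classes=classes)
```

Shuffling by index rather than copying the arrays keeps the memory overhead to a single permutation array, which matters when the dataset is large.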

Tags: python, machine-learning, batch-processing, datasets, memory-management, model-training