# Baseline Logistic Regression Model

Before you can measure the impact of a label flipping attack, you need something to compare against. That's the baseline — a clean model trained on uncorrupted data. Every attack evaluation starts here. Without a reliable baseline, you can't tell how much damage the attack caused, whether your defenses worked, or whether the numbers you're seeing are signal or noise.

This post walks through building a logistic regression classifier as a baseline for label flipping attack experiments. The choice of logistic regression is deliberate: it's interpretable, fast to train, sensitive to label corruption in predictable ways, and gives clean before/after comparisons that are easy to reason about.

***

### The Dataset

For this example, we'll use a binary spam classification task. The dataset contains email features (word frequencies, metadata) and binary labels: 0 for legitimate email, 1 for spam.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Using the SpamBase dataset — 4,601 emails, 57 features, binary labels
spam = fetch_openml('spambase', version=1, as_frame=True)
X = spam.data.astype(float).values
y = spam.target.astype(int).values  # 0 = ham, 1 = spam

# Train/test split — 80/20, stratified to preserve class balance
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(f"Training set: {X_train.shape[0]} examples")
print(f"Test set: {X_test.shape[0]} examples")
print(f"Class balance (train): {np.bincount(y_train)}")
# Training set: 3680 examples
# Test set: 921 examples
# Class balance (train): [2137 1543]  (ham, spam)
```

Checking class balance matters here. A heavily imbalanced dataset will mask the impact of targeted label flipping if you only look at accuracy.
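
A quick sanity check worth running alongside the split is the majority-class accuracy floor — the score a trivial model gets by always predicting the most common class. A minimal sketch using the split above:

```python
# Accuracy of a trivial classifier that always predicts the
# majority class. Any model score near this floor means the
# model learned essentially nothing, however "high" it looks.
majority_floor = np.bincount(y_train).max() / len(y_train)
print(f"Majority-class accuracy floor: {majority_floor:.4f}")
# roughly 0.6 on SpamBase — so the ~0.93 baseline below is real signal
```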

***

### Feature Scaling

Logistic regression is sensitive to feature scale, and the SpamBase features span very different ranges: the word- and character-frequency features are percentages, while the capital-run-length features run into the thousands. Standardize before training.

```python
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)  # fit on train only — no leakage
```

The scaler is fit on the training set and applied to the test set. Fitting on the full dataset would leak test distribution information into the model — a common mistake that inflates baseline numbers.
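
As a side note, scikit-learn's `Pipeline` makes this mistake structurally impossible, because `fit` only ever sees training data. A minimal sketch — the rest of this post sticks with the explicit two-step version so the scaler can be saved and reused on its own:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# fit() runs the scaler's fit_transform on X_train only;
# predict()/score() apply the already-fitted transform to X_test.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000, random_state=42))
pipe.fit(X_train, y_train)
print(f"Pipeline test accuracy: {pipe.score(X_test, y_test):.4f}")
```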

***

### Training the Baseline Model

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, 
    f1_score, confusion_matrix, classification_report
)

# Train baseline on clean labels
baseline_model = LogisticRegression(
    max_iter=1000,
    random_state=42,
    C=1.0  # inverse regularization strength — larger C means weaker regularization
)

baseline_model.fit(X_train_scaled, y_train)
```

`max_iter=1000` prevents convergence warnings on this dataset. `C=1.0` is the default inverse regularization strength — we'll keep it consistent across baseline and poisoned experiments so the comparison is fair.
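
One payoff of choosing logistic regression as the baseline is that its learned weights are directly inspectable. A quick sketch, assuming the `spam` frame loaded earlier:

```python
# For a binary problem, coef_ has shape (1, n_features); positive
# weights push predictions toward spam (class 1), negative toward ham.
feature_names = spam.data.columns
coefs = baseline_model.coef_[0]
for i in np.argsort(coefs)[::-1][:5]:  # five strongest spam indicators
    print(f"{feature_names[i]:<30} {coefs[i]:+.3f}")
```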

***

### Evaluating the Baseline

Don't just look at accuracy. For a poisoning-attack study, per-class metrics matter more — a targeted attack can degrade a specific class while leaving overall accuracy largely intact.

```python
y_pred_baseline = baseline_model.predict(X_test_scaled)

print("=== BASELINE MODEL PERFORMANCE ===")
print(f"Accuracy:  {accuracy_score(y_test, y_pred_baseline):.4f}")
print(f"Precision: {precision_score(y_test, y_pred_baseline):.4f}")
print(f"Recall:    {recall_score(y_test, y_pred_baseline):.4f}")
print(f"F1 Score:  {f1_score(y_test, y_pred_baseline):.4f}")
print()
print("Confusion Matrix:")
print(confusion_matrix(y_test, y_pred_baseline))
print()
print(classification_report(y_test, y_pred_baseline, target_names=['Ham', 'Spam']))
```

Expected baseline output (illustrative; your exact numbers may vary slightly):

```
=== BASELINE MODEL PERFORMANCE ===
Accuracy:  0.9251
Precision: 0.9098
Recall:    0.9121
F1 Score:  0.9110

Confusion Matrix:
[[499  35]
 [ 34 353]]

              precision    recall  f1-score   support
         Ham       0.94      0.93      0.93       534
        Spam       0.91      0.91      0.91       387
    accuracy                           0.93       921
```

These are your reference numbers. Every poisoning experiment will be compared against them. Record them clearly — they're the ground truth for "what this model does when the data is clean."

***

### Understanding What the Baseline Tells You

A few things worth noting about these numbers before proceeding to attack experiments:

**92.5% accuracy** is good but not the whole story. The confusion matrix shows 34 false negatives (spam classified as ham) and 35 false positives (ham classified as spam). Targeted label flipping will specifically inflate one of these numbers while leaving the other largely unchanged.

**Per-class recall** tells you how well the model finds each class. Spam recall at 91% means the model catches 91% of spam in the test set. If a targeted attack flips spam labels during training, that number will drop — and overall accuracy might drop only slightly, masking the attack.

**This is the trap.** Looking only at overall accuracy, a 5-percentage-point drop in spam recall shows up as only about a 2-point drop in overall accuracy, because spam is roughly 42% of the test set. Looks fine. Isn't fine.
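
The dilution is easy to verify with the test-split numbers above:

```python
# How a targeted drop in spam recall dilutes into overall accuracy.
n_test, n_spam = 921, 387              # test size and spam support from above
recall_drop = 0.05                     # a 5-point drop in spam recall
extra_fn = recall_drop * n_spam        # ~19 extra missed spam emails
print(f"Accuracy drop: {extra_fn / n_test:.3f}")  # ~0.021 — about 2 points
```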

***

### Saving the Baseline for Comparison

```python
import joblib

# Save model and scaler for reuse in attack experiments
joblib.dump(baseline_model, 'baseline_model.pkl')
joblib.dump(scaler, 'baseline_scaler.pkl')

# Save baseline metrics as reference
baseline_metrics = {
    'accuracy': accuracy_score(y_test, y_pred_baseline),
    'precision': precision_score(y_test, y_pred_baseline),
    'recall': recall_score(y_test, y_pred_baseline),
    'f1': f1_score(y_test, y_pred_baseline),
    'confusion_matrix': confusion_matrix(y_test, y_pred_baseline).tolist()
}

import json
with open('baseline_metrics.json', 'w') as f:
    json.dump(baseline_metrics, f, indent=2)

print("Baseline saved. Ready for attack experiments.")
```

This setup — consistent train/test split, same scaler, same hyperparameters — is what makes the attack comparisons valid. If you change the experimental setup between baseline and poisoned experiments, the delta you're measuring is noise, not attack impact.
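
In a later attack-experiment script, the frozen artifacts reload like this (a sketch, assuming the filenames above):

```python
import json

import joblib

# Reload the frozen baseline: every poisoned model is evaluated
# with the SAME scaler and compared against these stored numbers.
baseline_model = joblib.load('baseline_model.pkl')
scaler = joblib.load('baseline_scaler.pkl')
with open('baseline_metrics.json') as f:
    baseline_metrics = json.load(f)

print(f"Reference accuracy: {baseline_metrics['accuracy']:.4f}")
```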

***

### What Comes Next

The baseline gives you the clean-data performance ceiling. The next step is introducing label corruption — first randomly, then targeted — and measuring exactly how much each attack degrades the model, on which metrics, and whether the degradation is detectable through standard evaluation alone.
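
To make that concrete, here is a minimal sketch of the random variant — a hypothetical `flip_labels` helper the follow-up experiments could build on:

```python
def flip_labels(y, flip_rate, seed=42):
    """Randomly invert flip_rate of binary labels (0 <-> 1)."""
    rng = np.random.default_rng(seed)
    y_flipped = y.copy()
    idx = rng.choice(len(y), size=int(flip_rate * len(y)), replace=False)
    y_flipped[idx] = 1 - y_flipped[idx]  # binary labels, so 1 - y inverts the class
    return y_flipped

# e.g. y_train_poisoned = flip_labels(y_train, flip_rate=0.10)
```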

The baseline isn't exciting. But every meaningful measurement in adversarial ML starts with a reliable one.

