Error handling and logging

Robust Data Loader

On a new branch (error_handling), create a file load_scores.py.

Sample data

Create a file scores.csv with the following content:

name,score
Alice,95
Bob,notanumber
Charlie,72
Eve,
Frank,88

Task

Start from this skeleton and add error handling:

#!/usr/bin/env python3


def load_scores(path):
    """Load (name, score) tuples from a CSV file, skipping bad lines."""
    results = []
    f = open(path)
    for line_num, line in enumerate(f, start=1):
        if line_num == 1:
            continue  # skip header
        name, score_str = line.strip().split(",")
        score = float(score_str)
        results.append((name, score))
    f.close()
    return results


if __name__ == "__main__":
    for path in ["scores.csv", "missing.csv"]:
        scores = load_scores(path)
        print(f"{path}: Loaded {len(scores)} valid scores: {scores}")

This code crashes on the sample data. Make it robust:

  1. Handle lines that cannot be parsed (e.g. non-numeric score, missing fields) — log each skipped line with logger.warning() and continue with the next line.
  2. Use a context manager (with) instead of manual open/close.
  3. In the __main__ loop, handle FileNotFoundError so that a missing file is logged with logger.exception() and the loop continues with the next file.
  4. After processing all files, exit with return code 1 if any file was missing. The program should still finish processing all files before exiting.
  5. Expected result: the program logs one warning for each of the two bad lines in scores.csv (Bob and Eve), returns the three valid entries, then logs an error for missing.csv, carries on, and exits with code 1.
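
The file-level steps above (context manager, logging a missing file, deferring the exit code) can be sketched on a smaller problem. This is a minimal illustration under stated assumptions, not a reference solution: count_lines and process_all are hypothetical names, and counting lines stands in for the real per-file work.

```python
import logging

logger = logging.getLogger(__name__)


def count_lines(path):
    """Hypothetical stand-in for the real per-file work."""
    # The with statement closes the file even if an error occurs inside it.
    with open(path) as f:
        return sum(1 for _ in f)


def process_all(paths):
    """Process every path; return 1 if any file was missing, else 0."""
    exit_code = 0
    for path in paths:
        try:
            n = count_lines(path)
        except FileNotFoundError:
            # logger.exception logs at ERROR level and appends the traceback.
            logger.exception("Could not open %s", path)
            exit_code = 1  # remember the failure, but keep going
            continue
        logger.info("%s: %d lines", path, n)
    return exit_code
```

In a script, the returned value would typically be passed to sys.exit() at the end of the __main__ block, so that every file is processed before the program terminates.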

Requirements

  • Use the logging module with a named logger (not print).
  • The function should not crash on bad input.
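
As an illustration of these two requirements, the sketch below uses a named logger and skips unparsable input with a warning instead of crashing. The logger name "parse_demo" and the function parse_floats are assumptions made for this example; they are not part of the exercise skeleton.

```python
import logging

# A named logger; the name "parse_demo" is an arbitrary choice for this sketch.
logger = logging.getLogger("parse_demo")


def parse_floats(lines):
    """Convert each line to float, skipping lines that cannot be parsed."""
    values = []
    for line_num, line in enumerate(lines, start=1):
        text = line.strip()
        try:
            values.append(float(text))
        except ValueError:
            # %-style arguments are only formatted if the record is emitted.
            logger.warning("Skipping line %d: %r is not a number", line_num, text)
    return values
```

Calling logging.basicConfig() once in the __main__ block is one simple way to configure the output level and format for all named loggers.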

Stretch goal

  • Add a strict parameter: when True, raise on the first bad line instead of skipping.
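
One possible shape for such a switch, shown on a simplified problem: a bare raise inside the except block re-raises the original exception when strict is True, otherwise the line is logged and skipped. The names parse_ints and "strict_demo" are hypothetical and chosen only for this sketch.

```python
import logging

logger = logging.getLogger("strict_demo")


def parse_ints(lines, strict=False):
    """Parse integers; skip bad lines, or raise on the first one if strict."""
    values = []
    for line_num, line in enumerate(lines, start=1):
        try:
            values.append(int(line.strip()))
        except ValueError:
            if strict:
                raise  # re-raise the original exception unchanged
            logger.warning("Skipping line %d: %r", line_num, line)
    return values
```

Defaulting strict to False keeps existing callers on the lenient behaviour, so the stricter mode has to be requested explicitly.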

Push the branch with the changes, open an MR, and add Florian Ziemen as reviewer.