Summary of "Handling Date and Time Variables | Day 34 | 100 Days of Machine Learning"
Handling date & time variables (Day 34, 100 Days of ML)
Main idea
Date and time columns are highly informative. Convert columns stored as strings/objects to pandas datetime (pd.to_datetime) so you can extract many useful features (year, month, day, weekday, quarter, time parts, differences, etc.). The original video is a code-focused walkthrough demonstrating how to convert columns, extract features, and compute differences; the accompanying notebook is intended as a future reference.
Key concepts & lessons
- Always convert string/object date/time columns to pandas datetime using `pd.to_datetime` before extracting components.
- Use the `.dt` accessor to extract year, month, day, weekday, quarter, and time parts (hour/minute/second).
- Many useful features are “hidden” inside a single datetime value (week of year, day of week, month name, quarter, is-weekend, semester, day of year, etc.). Extract these as new columns so ML models can use them.
- Compute timedeltas by subtracting two datetime values; extract days, hours, minutes, seconds, or total seconds from the resulting timedelta.
- Timezone work is uncommon for many problems but can be handled with appropriate libraries (`pytz`/`dateutil`); most tasks only need basic extraction.
Step-by-step methodology
- Inspect dtypes
  - Check whether a column is object/string or already datetime: `df.info()` or `df['date'].dtype`
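A minimal sketch of the dtype check, assuming a small hand-made frame in place of the demo data (the `orders` column and the sample values are hypothetical):

```python
import pandas as pd

# Hypothetical stand-in for a freshly loaded CSV; values are made up.
df = pd.DataFrame({
    "date": ["2019-12-10", "2018-08-15", "2018-10-23"],
    "orders": [10, 3, 7],
})

# Dates read from text are stored as strings, not datetime64.
print(df["date"].dtype)

# A programmatic check that does not depend on the exact string dtype name.
is_datetime = pd.api.types.is_datetime64_any_dtype(df["date"])
print(is_datetime)
```

If the check is false, the column still needs converting before `.dt` can be used on it.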
- Convert string/object to datetime
  - Example: `df['date'] = pd.to_datetime(df['date'])`
  - If you have a separate time column: `df['time'] = pd.to_datetime(df['time'])`
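The conversion can be sketched as follows. The `format=` and `errors="coerce"` arguments are optional additions not shown in the summary, and the sample strings (including the deliberately bad one) are made up for illustration:

```python
import pandas as pd

# Hypothetical raw column with one value that cannot be parsed.
raw = pd.DataFrame({"date": ["2019-12-10", "2018-08-15", "not a date"]})

# format= states the expected layout explicitly;
# errors="coerce" turns unparseable values into NaT instead of raising.
raw["date"] = pd.to_datetime(raw["date"], format="%Y-%m-%d", errors="coerce")

# Count how many values failed to parse.
n_bad = int(raw["date"].isna().sum())
print(raw["date"].dtype, n_bad)
```

Counting the `NaT` values afterwards is a cheap way to notice rows that did not match the expected layout.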
- Extract calendar / date features (use `.dt`)
  - Examples:
    - Year: `df['year'] = df['date'].dt.year`
    - Month (numeric): `df['month'] = df['date'].dt.month`
    - Month name: `df['month_name'] = df['date'].dt.month_name()`
    - Day of month: `df['day'] = df['date'].dt.day`
    - Day of week (numeric): `df['dayofweek'] = df['date'].dt.dayofweek` (Mon=0..Sun=6)
    - Day name: `df['day_name'] = df['date'].dt.day_name()`
    - Is weekend: `df['is_weekend'] = df['date'].dt.dayofweek.isin([5, 6])`
    - Week number: `df['week'] = df['date'].dt.isocalendar().week` (`.dt.week` is deprecated)
    - Day of year: `df['day_of_year'] = df['date'].dt.dayofyear`
    - Quarter: `df['quarter'] = df['date'].dt.quarter`
    - Semester (example logic): map quarters 1–2 -> semester 1, quarters 3–4 -> semester 2
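Put together, the extraction might look like the sketch below. The three sample dates are made up, and the integer-division trick for the semester is one possible way to implement the quarter mapping, not necessarily the one used in the video:

```python
import pandas as pd

# Hypothetical dates: a Tuesday, a Wednesday, and a Saturday.
df = pd.DataFrame({"date": pd.to_datetime(["2019-12-10", "2018-08-15", "2021-05-01"])})

df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month
df["month_name"] = df["date"].dt.month_name()
df["day"] = df["date"].dt.day
df["dayofweek"] = df["date"].dt.dayofweek            # Mon=0 .. Sun=6
df["day_name"] = df["date"].dt.day_name()
df["is_weekend"] = df["date"].dt.dayofweek.isin([5, 6])
df["week"] = df["date"].dt.isocalendar().week.astype(int)  # ISO week number
df["day_of_year"] = df["date"].dt.dayofyear
df["quarter"] = df["date"].dt.quarter

# Quarters 1-2 -> semester 1, quarters 3-4 -> semester 2.
df["semester"] = (df["quarter"] + 1) // 2

print(df[["date", "day_name", "is_weekend", "quarter", "semester"]])
```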
- Compute differences between dates (timedelta)
  - Example diffs: `diff = pd.to_datetime('today') - df['date']` or `diff = df['date2'] - df['date1']`
    - Days: `diff.dt.days`
    - Total seconds: `diff.dt.total_seconds()`
    - Convert to hours/minutes: `diff.dt.total_seconds() / 3600` (hours) or `diff.dt.total_seconds() / 60` (minutes)
  - `diff.dt.days` returns whole days only and drops any leftover hours. For a per-component breakdown (days, hours, minutes, seconds), use `diff.dt.components`.
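A short sketch of the timedelta arithmetic, assuming two hypothetical columns `ordered` and `delivered` with made-up timestamps:

```python
import pandas as pd

# Hypothetical order and delivery timestamps.
df = pd.DataFrame({
    "ordered": pd.to_datetime(["2024-01-01 08:00:00", "2024-01-10 12:30:00"]),
    "delivered": pd.to_datetime(["2024-01-03 20:00:00", "2024-01-10 18:00:00"]),
})

# Subtracting two datetime columns yields a timedelta column.
diff = df["delivered"] - df["ordered"]

df["days"] = diff.dt.days                      # whole days only
df["hours"] = diff.dt.total_seconds() / 3600   # full duration in hours
df["minutes"] = diff.dt.total_seconds() / 60   # full duration in minutes

# Per-component breakdown: days, hours, minutes, seconds, ...
parts = diff.dt.components
print(df[["days", "hours", "minutes"]])
print(parts[["days", "hours", "minutes"]])
```

The first row spans 2 days and 12 hours, so `days` is 2 while `hours` is 60.0; the difference between the two is why `total_seconds()` is usually the safer base for duration features.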
- Work with time-only parts
  - After conversion:
    - Hour: `df['hour'] = df['time'].dt.hour`
    - Minute: `df['minute'] = df['time'].dt.minute`
    - Second: `df['second'] = df['time'].dt.second`
    - Time object only: `df['time_only'] = df['time'].dt.time`
  - To compute time differences in seconds/minutes, use `diff.dt.total_seconds()` and divide as appropriate.
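The time-part extraction can be sketched like this. The timestamps are made up, and the seconds-since-midnight column is an illustrative extra rather than something the summary describes:

```python
import pandas as pd

# Hypothetical message timestamps.
msgs = pd.DataFrame({
    "time": ["2013-12-15 00:50:00", "2014-04-29 23:40:05", "2012-12-30 09:05:30"]
})
msgs["time"] = pd.to_datetime(msgs["time"])

msgs["hour"] = msgs["time"].dt.hour
msgs["minute"] = msgs["time"].dt.minute
msgs["second"] = msgs["time"].dt.second
msgs["time_only"] = msgs["time"].dt.time   # Python datetime.time objects

# Illustrative extra: time of day as one number, via a timedelta from midnight.
since_midnight = msgs["time"] - msgs["time"].dt.normalize()
msgs["seconds_since_midnight"] = since_midnight.dt.total_seconds()

print(msgs)
```

Note that `.dt.time` produces plain Python objects, so the numeric columns are usually the ones to feed to a model.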
Notes & tips
- Many practical use cases (expense tracking, chat/message timestamps, e-commerce order timestamps) benefit from these extracted features.
- Keep the notebook and helper functions as a reference and reuse them in future projects that involve date/time.
- Timezone handling is rare in basic tasks; use timezone-aware conversions only when needed.
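For the cases where timezones do matter, a minimal sketch using pandas' own `tz_localize` and `tz_convert` is shown below. This swaps in the pandas accessor methods rather than calling `pytz` or `dateutil` directly, and it assumes the naive timestamps were recorded in UTC; the target zone is an arbitrary example:

```python
import pandas as pd

# Hypothetical naive timestamps, assumed here to have been recorded in UTC.
ts = pd.Series(pd.to_datetime(["2024-03-01 09:00:00", "2024-03-01 18:30:00"]))

# Attach a timezone to naive values, then convert to another zone.
utc = ts.dt.tz_localize("UTC")
local = utc.dt.tz_convert("Asia/Kolkata")   # UTC+05:30

print(local)
```

Converting before extraction matters because features such as hour or day can shift: the second timestamp lands on the next calendar day in the converted zone.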
Files / datasets used in the demo
- `orders.csv`: order-related, used for date-based feature extraction
- `messages.csv`: chat/message timestamps, used for time-based extraction
Speakers / sources featured
- The video presenter / YouTuber (instructor) — sole speaker walking through the code and examples
Category
Educational