Crack the Code!

import pandas as pd

data_url = "https://raw.githubusercontent.com/GettysburgDataScience/datasets/refs/heads/main/countypres_2000-2024.csv"
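# Note: no column is named 'fips' (the column is 'county_fips'), so this dtype
# has no effect and county_fips is read as float, as the output below shows.
elec_df = pd.read_csv(data_url, dtype={'fips': str})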

elec_df

	year	state	state_po	county_name	county_fips	office	candidate	party	candidatevotes	totalvotes	version	mode
0	2000	ALABAMA	AL	AUTAUGA	1001.0	US PRESIDENT	AL GORE	DEMOCRAT	4942	17208	20250821	TOTAL
1	2000	ALABAMA	AL	AUTAUGA	1001.0	US PRESIDENT	GEORGE W. BUSH	REPUBLICAN	11993	17208	20250821	TOTAL
2	2000	ALABAMA	AL	AUTAUGA	1001.0	US PRESIDENT	OTHER	OTHER	113	17208	20250821	TOTAL
3	2000	ALABAMA	AL	AUTAUGA	1001.0	US PRESIDENT	RALPH NADER	GREEN	160	17208	20250821	TOTAL
4	2000	ALABAMA	AL	BALDWIN	1003.0	US PRESIDENT	AL GORE	DEMOCRAT	13997	56480	20250821	TOTAL
...	...	...	...	...	...	...	...	...	...	...	...	...
94404	2024	WYOMING	WY	WESTON	56045.0	US PRESIDENT	DONALD J TRUMP	REPUBLICAN	3069	3512	20250821	NaN
94405	2024	WYOMING	WY	WESTON	56045.0	US PRESIDENT	KAMALA D HARRIS	DEMOCRAT	378	3512	20250821	NaN
94406	2024	WYOMING	WY	WESTON	56045.0	US PRESIDENT	OTHER	OTHER	18	3512	20250821	NaN
94407	2024	WYOMING	WY	WESTON	56045.0	US PRESIDENT	OVERVOTES	NaN	1	3512	20250821	NaN
94408	2024	WYOMING	WY	WESTON	56045.0	US PRESIDENT	UNDERVOTES	NaN	20	3512	20250821	NaN

94409 rows × 12 columns

Possible project ideas:
Classification: identify counties with swing potential.
Economic indicators and voting.
Regression: model voter turnout before and after COVID.
Prediction: estimate lean from income, education, race, religion, occupation, gender, and age distribution.

Note

The way election tallies were reported changed in 2020. Before 2020, only total counts are provided (the mode column is TOTAL). For the 2020 and 2024 elections, some states break the counts down by voting method (e.g., absentee, provisional), but the nomenclature varies by state, and in 2024 some states have no mode recorded at all (NaN).

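# Build a table of the distinct 'mode' values each state reports in each election year.
modes = {'state': []}
for year in range(2000, 2025, 4):
    modes[str(year)] = []

for state in elec_df['state_po'].unique():
    modes['state'].append(state)

    for year in range(2000, 2025, 4):
        # query() reads the local variables state and year through the @ prefix
        state_year = elec_df.query("state_po == @state and year == @year")
        modes[str(year)].append(state_year['mode'].unique())

voting_modes = pd.DataFrame(modes)
voting_modes.tail(10)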

	state	2000	2004	2008	2012	2016	2020	2024
41	SD	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL VOTES, VOTE CENTER]
42	TN	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL VOTES]
43	TX	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[nan, EARLY VOTING, TOTAL VOTES]
44	UT	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[EARLY, ELECTION DAY, MAIL, TOTAL]	[nan]
45	VT	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[nan]
46	VA	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[ABSENTEE, ELECTION DAY, PROVISIONAL]	[TOTAL VOTES]
47	WA	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[nan]
48	WV	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[nan]
49	WI	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[nan]
50	WY	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[TOTAL]	[nan]
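
Because some states report both a total row and per-method rows (for example, UT in 2020 above), summing every row for a county can count the same votes twice. The sketch below shows one possible way to guard against that on a small invented table. The toy rows, the `TOTAL_LABELS` set, the `county_totals` helper, and the choice to treat a missing mode as a total are all assumptions made for illustration, not rules documented for the dataset, so check them against each state before relying on the result.

```python
import pandas as pd

# Toy rows invented for illustration; they are not from the real dataset.
toy = pd.DataFrame({
    "county_name":    ["ALPHA", "ALPHA", "ALPHA", "BETA", "BETA", "GAMMA"],
    "candidate":      ["X", "X", "X", "X", "X", "X"],
    "mode":           ["EARLY", "ELECTION DAY", "TOTAL", "ABSENTEE", "ELECTION DAY", None],
    "candidatevotes": [40, 60, 100, 30, 70, 55],
})

# Assumption: these labels (and a missing mode) mark rows that already hold the full count.
TOTAL_LABELS = {"TOTAL", "TOTAL VOTES"}

def county_totals(df):
    """Return one vote total per (county, candidate) without double counting."""
    keys = ["county_name", "candidate"]
    is_total = df["mode"].isna() | df["mode"].isin(TOTAL_LABELS)
    # True for every row whose (county, candidate) group contains a total row.
    has_total = is_total.groupby([df[k] for k in keys]).transform("any")
    # Keep total rows where they exist; otherwise keep the breakdown rows.
    keep = df[is_total | ~has_total]
    return keep.groupby(keys)["candidatevotes"].sum()

totals = county_totals(toy)
print(totals)
```

In this toy table, ALPHA keeps only its TOTAL row, BETA has no total row so its two breakdown rows are summed, and GAMMA's single row with a missing mode is used as is.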

Clues

Below are 7 questions of increasing difficulty. Each answer is numeric. Find the answers and assign the value to the corresponding variable in the Answers section below. The secret messages are clues to a fictitious scientific breakthrough that purportedly happened today.

a. What number is in the row with index 189 under the column ‘candidatevotes’?
b. How many different counties are tabulated in this dataset?
c. What is the county_fips code for the alphabetically last county in Pennsylvania?
d. In 2024, how many election day (not mail-in or provisional) votes did DJT get in Adams County, PA?
e. How many more votes were cast in PA in the 2020 election than the 2016 election?
f. In the 2000 election, by how many votes did the Republican party win Florida?
g. Rounded to the nearest percentage point, what was the Republican lean of Adams County in 2020?
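
The questions above mostly combine three pandas patterns: counting distinct values, filtering rows, and aggregating by group. The sketch below shows those patterns on a small invented table so that it does not give away any answers. The toy rows and the variable names are assumptions for illustration. On the real data, remember that county names repeat across states and that some states report several vote modes per county.

```python
import pandas as pd

# Toy rows invented for illustration; they are not from the real dataset.
toy = pd.DataFrame({
    "year":           [2016, 2016, 2016, 2016, 2020, 2020, 2020, 2020],
    "state_po":       ["ZZ"] * 8,
    "county_name":    ["ALPHA", "ALPHA", "BETA", "BETA"] * 2,
    "party":          ["DEMOCRAT", "REPUBLICAN"] * 4,
    "candidatevotes": [10, 20, 30, 15, 12, 25, 33, 14],
})

# Count the distinct values in a column.
n_counties = toy["county_name"].nunique()

# Filter rows with query(); the @ prefix refers to a Python variable.
year = 2020
rows_2020 = toy.query("year == @year and party == 'REPUBLICAN'")

# Aggregate: total votes per year and party.
by_year_party = toy.groupby(["year", "party"])["candidatevotes"].sum()

# Difference between two aggregated values.
rep_2020 = by_year_party.loc[(2020, "REPUBLICAN")]
dem_2020 = by_year_party.loc[(2020, "DEMOCRAT")]
margin_2020 = rep_2020 - dem_2020
```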

# Here is an example for question a.
# With the default integer index, position 189 (.iloc) is also the row labeled 189.

a = elec_df['candidatevotes'].iloc[189]
a

np.int64(17084)

# Replace the placeholder values for b through g with your answers.
a = 17084
b = 2
c = 3
d = 4
e = 5
f = 6
g = 7

secret_message_a = '390495152144 462308199314 269422551417375 290309197479 584684671319423198605158164111 644479264684439 580671323423619633'
secret_message_b = '591674305432605 689479189684419 607681675314421142 323305 250429550417376 488305199304 432489145198375196'

ans = [a, b, c, d, e, f, g]

message_a = decode_message(secret_message_a, ans)
message_b = decode_message(secret_message_b, ans[::-1])

print(f'Quote 1: {message_a}')
print(f'Quote 2: {message_b}')

Quote 1: Huh. ____ _____ ____ __________ _____ ______
Quote 2: _____ _____ ______ __ _____ ____ roads.
