PS04

DS256, Gettysburg College

Prof Eatai Roth

Due: Oct 3, 2025, 5 p.m. (soft deadline; you may hand in up until the following Monday)

ANNOUNCEMENT - Otter grader

The grader.check cells in this assignment have been tested, but if you find a mistake, please report it on the ProblemSet forum on Moodle.

The questions are in no particular order in terms of difficulty, so if you feel stuck, skip the question and come back to it.

ANNOUNCEMENT - Challenge problems

The last two problems, 5 and 6, are optional. They are significantly more challenging, but doable with what we’ve learned. You may do either of those problems instead of 1-4. They do not have a grader.check, so you’ll have to write some tests to demonstrate that they work as described.

Your Name: …

Collaborators: …

By submitting this work, you attest that:

  • it reflects your own effort and understanding of the material

  • you did not ask for help from students outside the course, other than PLAs

  • you did not use a generative AI to produce any code used below.

# Initialize Otter
import otter
grader = otter.Notebook("ps04.ipynb")

1. Write the function sum_until() that takes as inputs a list of numbers and a target. The function should sum the numbers in order until the running sum meets or exceeds the target, then return the list of numbers used and the sum. If you get through the whole list without meeting the target, return the entire list and its sum.

Hint: Use break to exit a loop once the target is met.
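
As a small illustration of the hint, the sketch below uses break on an unrelated task: finding the first negative number in a list. The names readings, checked, and first_negative are made up for this example and are not part of the assignment.

```python
# Hypothetical data for illustration only.
readings = [4, 8, -2, 5, -7]

first_negative = None
checked = []  # values examined before the loop stopped

for value in readings:
    checked.append(value)
    if value < 0:
        first_negative = value
        break  # leave the loop immediately; 5 and -7 are never examined

print(first_negative, checked)  # -2 [4, 8, -2]
```

Note that the value that triggers the break has already been appended to checked by the time the loop exits.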

Example:

sum_until([3, -1, 7, 4, 9], 10) –> [3, -1, 7, 4], 13

def sum_until(num_list, target):
    'your code here'
# Is it doing what you expect it to do?
print(sum_until([1,3,5,7,9,11], 20)) # should be ([1, 3, 5, 7, 9], 25)
print(sum_until([1,-1, 1, -1, 1, -1], 1)) # should be ([1], 1)
print(sum_until([1,-1, 1, -1], 2)) # should be ([1, -1, 1, -1], 0)
grader.check("q1")

2. Write the function max_compare() that takes as input two lists of numbers. The output should be a list containing, at each position, the greater of the two values at that position in the input lists. If the two lists are not the same size, the output list should be the length of the shorter list, and all unmatched data is discarded.

For example:

A = [1.1, 2.2, 3.3]
B = [0, 4, 2, 6, 8, 10]

max_compare(A, B)

should return:

[1.1, 4, 3.3]
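
One possible way to walk two lists in step is the built-in zip(), which stops at the end of the shorter list. This is only one option, shown here on made-up data (names, scores); looping over range(min(len(a), len(b))) with indexing would work as well.

```python
# Hypothetical data, unrelated to the assignment's inputs.
names = ['Ada', 'Grace', 'Alan']
scores = [90, 85]

pairs = []
for name, score in zip(names, scores):
    # 'Alan' has no matching score, so zip() never yields it.
    pairs.append((name, score))

print(pairs)  # [('Ada', 90), ('Grace', 85)]
```
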
def max_compare(listA, listB):
    'your code here'
grader.check("q2")

3. Write the function check_zipcodes() that takes as input a list of strings. The function should return a list of only valid zipcodes, with the invalid entries replaced with ‘NA’, and a second list with the indices of the errors.

Hints:

  • You may consider any string of just 5 digits a valid zipcode.

  • isinstance(var, type) returns True if var is of the given type.

  • The string method isdigit() returns True if every character in the string is a digit.

  • Use enumerate to keep track of the indices where the errors occur.
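
The three built-ins from the hints can be tried out together on a small list. The sketch below uses made-up data (entries) and a made-up result list (report); it only shows what each call returns and is not meant as the solution.

```python
# Hypothetical mixed-type data, not the assignment's input.
entries = ['42', 'abc', 7, '3.14']

report = []
for i, entry in enumerate(entries):  # i is the position, entry is the value
    is_string = isinstance(entry, str)
    # Only strings have .isdigit(); the `and` skips the call for non-strings.
    all_digits = is_string and entry.isdigit()
    report.append((i, is_string, all_digits))

print(report)
# [(0, True, True), (1, True, False), (2, False, False), (3, True, False)]
```

Checking isinstance() first matters: calling .isdigit() on the integer 7 would raise an AttributeError.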

'your code here'
zip_list = ['19003', '17325', 'MXW-1178', '00000', 901210, '61301-1842']
check_zipcodes(zip_list)
grader.check("q3")

4. Write the function groupby_artist() that takes as input an artist list and a track list. In the lists, there may be multiple tracks by an artist. The function should output an artist_list with only unique entries and a track_list in which each entry is a list of songs (a list containing lists) by the corresponding artist.

For example:

artists = ['Wye Oak', 'Tame Impala', 'Yeah Yeah Yeahs', 'Wye Oak', 'Polyphia', 'Yeah Yeah Yeahs', 'Yeah Yeah Yeahs']
tracks = ['Civilian', 'Let It Happen', 'Maps', 'Holy Holy', 'G.O.A.T.', 'Gold Lion', 'Heads Will Roll']

groupby_artist(artists, tracks)

should return:

(['Wye Oak', 'Tame Impala', 'Yeah Yeah Yeahs', 'Polyphia'],
 [['Civilian', 'Holy Holy'],
  ['Let It Happen'],
  ['Maps', 'Gold Lion', 'Heads Will Roll'],
  ['G.O.A.T.']])
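
A few list operations are handy when keeping two parallel lists like these in sync: the in operator, the .index() method, and appending to an inner list. The sketch below shows them on made-up data (labels, groups); how to combine them is left to you.

```python
# Hypothetical data for illustration only.
labels = ['red', 'blue']
groups = [['apple'], ['sky']]  # groups[i] belongs to labels[i]

print('red' in labels)    # True
print('green' in labels)  # False

# .index() gives the position of an existing label...
position = labels.index('blue')
# ...which selects the inner list to extend.
groups[position].append('ocean')

print(groups)  # [['apple'], ['sky', 'ocean']]
```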

def groupby_artist(artist_list, track_list):
    'your code here'
artists = ['Wye Oak', 'Tame Impala', 'Yeah Yeah Yeahs', 'Wye Oak', 'Polyphia', 'Yeah Yeah Yeahs', 'Yeah Yeah Yeahs']
tracks = ['Civilian', 'Let It Happen', 'Maps', 'Holy Holy', 'G.O.A.T.', 'Gold Lion', 'Heads Will Roll']

groupby_artist(artists, tracks)
grader.check("q4")

5. (Optional challenge…you can do this problem instead of all the others)

Write the two functions snippet() and unsnippet().

snippet() takes as input a string of text and splits it into words. The function returns a list of unique words and a corresponding list of lists with the indices at which each word is found. A word is unique if any character is different (e.g. ‘the’ and ‘The’ are unique, ‘end’ and ‘end?’ are unique).

unsnippet() takes as input a list of unique words and a list-of-lists of the indices of the words and reconstructs the original string.

Do not use the .join() string method.
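
Two building blocks may help here: str.split() with no argument, which splits on whitespace and leaves punctuation attached to words, and plain string concatenation with +=, which can stand in for .join(). The sketch below uses a made-up sentence (text) and assumes words are separated by single spaces.

```python
# Hypothetical text for illustration only.
text = "The end is the end?"

words = text.split()  # no argument: split on runs of whitespace
print(words)  # ['The', 'end', 'is', 'the', 'end?']

# Rebuild a string by concatenation, adding a space between words.
rebuilt = ''
for i, word in enumerate(words):
    if i > 0:
        rebuilt += ' '
    rebuilt += word

print(rebuilt == text)  # True
```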

...

my_string = "I'm a mother pheasant plucker. I pluck mother pheasants. I'm the most pleasant mother pheasant plucker ever to pluck a pheasant."
uniques, idxs = snippet(my_string)
unsnippet(uniques, idxs)

6. (Optional challenge…you can do this problem instead of all the others) Write the function spell_out_number() that will write out any whole number between 1 and 1,000,000,000 (1 billion) as a string.

In a cell below your function code, give several examples of your function working.

Example:

spell_out_number(117) –> “one hundred seventeen”

spell_out_number(2879012) –> “two million, eight hundred seventy-nine thousand, twelve”
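
One possible first step, sketched under the assumption that you want to handle the millions, thousands, and ones separately, is to break the number into three-digit groups with divmod(). The helper name three_digit_groups is made up for this illustration; turning each group into words is the part left for you.

```python
# Hypothetical helper name; splits a whole number into three-digit groups.
def three_digit_groups(n):
    groups = []
    while n > 0:
        n, remainder = divmod(n, 1000)  # quotient and remainder in one step
        groups.insert(0, remainder)     # most significant group goes first
    return groups

print(three_digit_groups(2879012))  # [2, 879, 12]
print(three_digit_groups(117))      # [117]
```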

Submission

When you are ready to submit, uncomment (cmd+/ or ctrl+/) and run the cell below.

!git add *
!git commit -m 'ps04 submitted'
!git push -f