Today I Learned Ruby's Uniq Method Has Superpowers
Using Ruby's Uniq Method with Multiple Conditions Cleaned Up a Dataset
I always kinda took the .uniq method for granted, just appending it to an array in an Array.uniq.count type of usage. Not anymore.
I have a project I’m working on that uses zipcodes to link a lot of data together. I have been looking for a way to reduce my table size from 35K+ rows to possibly 3K rows by consolidating my data into unique records and collecting each record’s zipcodes into an array. I usually reach for .uniq without much thought when I’m debugging something. Once I took a closer look at how flexible it is, I found it helpful in my use case. Here is a row from the original table, followed by the consolidated record I was aiming for.
=> #<County2Zipcode:0x0000560360f9b758
id: 1,
state_fips: "1",
state: "Alabama",
state_abbr: "AL",
zipcode: "35004",
county: "St. Clair",
city: "Acmar">
=> #<CountyWithZipcode:0x0000560360124d50
id: 3937,
state: "Alabama",
state_abbr: "AL",
county: "St. Clair",
zipcodes: ["35004", "35052", "35054", "35112", "35120", "35125", "35128", "35131", "35135", "35146", "35953", "35987"]>
RubyGuides had a cool post about the uniq method’s multiple condition functionality: pass uniq a block that returns an array, and two records count as duplicates only when every element of that array matches. This allowed me to pass my conditions in a block and pull out one row per state and county. I used those unique rows to build the new base CountyWithZipcode model, then made a second pass to pull all the zipcodes into an array. Here’s an example of the usage I found super helpful.
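# Create unique records from County2Zipcode
valid_keys = %w[state state_abbr county]

# uniq with a block keeps the first record for each state/state_abbr/county combination
County2Zipcode.all.uniq { |record| [record.state, record.state_abbr, record.county] }.each do |new_record|
  # slice returns a hash containing only the listed attributes
  attrs = new_record.slice(*valid_keys)
  CountyWithZipcode.create(attrs)
end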
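The second pass isn't shown above, so here is a rough sketch of how both passes could look in plain Ruby, without Rails. The Row struct, the sample rows (including the made-up "Example County" and its "00000" zipcode), and the group_by approach are my own assumptions for illustration, not the code from the project.

```ruby
# Plain-Ruby stand-in for County2Zipcode rows (hypothetical sample data).
Row = Struct.new(:state, :state_abbr, :county, :zipcode, keyword_init: true)

rows = [
  Row.new(state: "Alabama", state_abbr: "AL", county: "St. Clair", zipcode: "35004"),
  Row.new(state: "Alabama", state_abbr: "AL", county: "St. Clair", zipcode: "35052"),
  Row.new(state: "Alabama", state_abbr: "AL", county: "Example County", zipcode: "00000")
]

# The "multiple conditions": rows are duplicates only if all three values match.
key = ->(row) { [row.state, row.state_abbr, row.county] }

# Pass 1: uniq with a block keeps the first row for each distinct key.
unique_rows = rows.uniq(&key)

# Pass 2: group every row by the same key and collect its zipcodes.
zipcodes_by_key = rows.group_by(&key).transform_values { |group| group.map(&:zipcode) }

# Combine both passes into one consolidated hash per county.
consolidated = unique_rows.map do |row|
  {
    state: row.state,
    state_abbr: row.state_abbr,
    county: row.county,
    zipcodes: zipcodes_by_key[key.call(row)]
  }
end
```

With real models, each hash in consolidated would roughly correspond to the attributes of one CountyWithZipcode record. Reusing the same key lambda for both uniq and group_by keeps the two passes in agreement about what counts as "the same county".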