Two Arrays: Don’t Double Loop

I ran into a coding challenge at work. I had to arrays of objects. I needed to get the _id from artists and add it to artworks array. I did this whole dance in running two nested loops to get what I needed. Here’s a sample of the input and expected output:

const artits = [
	{
		"byline": "Clay Smith",
		"_id": "SmithC",
	},
	{
		"byline": "Rick Slick",
		"_id": "slick_rick",
	},
	{
		"byline": "Tracy Tran",
		"_id": "ttran",
	}
]

const artworks = [
	{
	  "id": "CBE220126c.jpg",
	  "name": "Clay Smith",
	},
	{
	  "id": "RS220236c.jpg",
	  "name": "Rick Slick",
	},
	{
	  "id": "TT220216c.jpg",
	  "name": "Tracy Tran",
	}
]

// output
const expectedOutput = [
	{
	  "id": "CBE220126c.jpg",
	  "name": "Clay Smith",
	  "_id": "SmithC",
	},
	{
	  "id": "RS220236c.jpg",
	  "name": "Rick Slick",
	  "_id": "slick_rick",
	},
	{
	  "id": "TT220216c.jpg",
	  "name": "Tracy Tran",
	  "_id": "ttran"
	}
]

Okay lets take a look my first attempt, with ugly nested loops.

let output = []
// O(n)^2
const artistsAndArtworks = artists.forEach((artist) => {
	artworks.forEach((artwork) => {
		if (artist.byline === artwork.name) {
			output.push({
				...artwork,
				_id: artist._id
			})
		}
		
	})
});
// console.log(output);

Why is this ugly? Well, I am iterating through one collection and then another inside of it. What that means is in a worse case scenario, such as having lots and lots of records, I have to take the first trip from start to finish for the first track, and also run through the second track start to finish (with presumably lots of records), one by one per collection. That can take up a lot time!

Our big O is n^2. That’s an exponential graph there. We have an expensive solution regarding time.

So why is this a solution though? First off, the arrays presents as two lists. Why that is meaningful is because lists mean you can loop through them with our trusty array methods!

If I was doing this manually, I imagine myself going searching for the artist byline and matching against the artwork’s name. If I find a match, then build an object with a copy (spread) of the current artwork, and add the _id from the current artist. Kind of shitty, but it works.

This is how I usually think and I know it’s not optimal. It can be better. So how can we avoid nesting loops but get the look up we want for an _id ? My pal, Freddy Montano, helped me think about the problem from a different perspective. A great way to look at this is map the name of the artist to an _id. The best thing about maps is that you can access content much easier than looping through.

There is a name in the artworks, that name can serve as a relation key. Following me? A relation key as in a key to a value object. An object! Aint finding stuff in objects pretty easy? It is.

const artistMap = artists.reduce( (acc,artist) => {
	acc[artist.byline] = artist._id
	return acc;
}, {});

const artworkMap = artworks.map(artwork => {
	return {
		...artwork,
		_id: artistMap[artwork.name]
	}
});
console.log(artworkMap);

Isn’t that more elegant? So lets take a look at the reduce() method there. I personally am still surprise in finding how powerful this method is. I sort of think a reduce() is like running a tab with a waiter. That waiter is keeping track of what you’ve been ordering in your group. The waiter in the reduce() is our accumulator.

First lets initialize our reduce() with an empty object, {}. We want to create an object to quickly look up our _id for a given artist.

...}, {});

Let’s look at the inside of the method. The real magic is in this line:

acc[artist.byline] = artist._id

By assigning our acc to a computed property, or calculated key, to the the artist id, you get an output like this:

{ "Tracy Tran": "ttran"}

Pretty neat! Now we got an object that maps from the artist name to their _id. Our returned object is assigned to artistMap.

Now we can simply loop through our collection of artwork and add our _id. But first two mental steps to achieve.

We want to copy everything in the array’s current object
Append to that object the _id. How do we look up the _id?

We can copy our object by using a spread method:

...artwork

Pretty easy! Next is a little tricky. The name is the key in artistMap object. My first thought was to do artists[something goes here. That will fail! It will fail because artists is an array and that’s not how you look up (unless you know the index). Instead I attempted to do artwork[ something goes here ]. That certainly does not make sense because artwork does not contain our _id yet.

Remember, our artistMap is an object. We can use that to look up with artwork.name to match for the byline of the artist.

_id: artistMap[artwork.name]

Seems obvious! But a good perspective to keep is the beginner’s mind. The steps to get there reveals the path. The destination is pretty clear once you get to where you are going.

Lets visualize whats going on with the _id assignment

_id: artistMap['Clay Smith'] will search for Clay Smith in our object (note that object keys are always unique).

{"Clay Smith": 'SmithC'}

The result will be _id: "SmithC". We avoid two nested loops and leverage a data structure, our little object to gain better performance. The trade off is taking up some memory for looping our collection and adding a new object.

This was fun ride. Thanks for reading.

One reply on “Two Arrays: Don’t Double Loop”

Leave a Reply Cancel reply