Remove Duplicates Multidimensional Array?

johnfl68 has asked for the wisdom of the Perl Monks concerning the following question:

Hello, and again many thanks to the many helpful people here.

Any suggestions on removing duplicates in a Multidimensional Array/Hash (sorry, I always get the array/hash thing confused, I think it's the dyslexia)?

There are some entries that are "almost" identical, and I want to delete the duplicates:

$VAR1 = {
          'vehicalData' => [
                          {
                            'manufacturer' => 'Kia',
                            'model' => 'Rio',
                            'year' => '2001',
                            'mileage' => '130256',
                            'color' => 'Metallic Blue'
                          },
                          {
                            'color' => 'Lt Blue',
                            'mileage' => '130242',
                            'year' => '2001',
                            'model' => 'Rio',
                            'manufacturer' => 'Kia'
                          },
                          {
                            'model' => 'Focus',
                            'manufacturer' => 'Ford',
                            'year' => '2003',
                            'color' => 'Red',
                            'mileage' => '100684'
                          }
                        ]
        };

I kind of understand using unique, but for a single array. I'm not really sure where to begin with this.

There are more than 3 entries, but this gives you an example. The first 2 entries are the same car, just 2 separate entries that made it into the database with slightly different information. I want to get rid of one of the entries.

I'm just not sure where to start on this one, so any pointers in the right direction would be greatly appreciated.

Thanks again!


Re: Remove Duplicates Multidimensional Array? by aaron_baugher (Curate) on Apr 17, 2015 at 02:17 UTC
It depends on how you determine whether two entries in the array are "the same." But in general, "removing duplicates" always follows the same principle: put the entries into a hash keyed on the elements that need to be unique, then turn that hash back into an array.

For instance, if two entries having the same manufacturer, model, and year are considered "the same," I would start by putting the entries into a hash with the keys being a concatenation of each entry's manufacturer, model, and year. So the first one might become this hash element:

    'Kia_Rio_2001' => {
        'manufacturer' => 'Kia',
        'model' => 'Rio',
        'year' => '2001',
        'mileage' => '130256',
        'color' => 'Metallic Blue'
    }

Now when you put the second one in the hash, it will get this same key, so one of them will go away. (You'll have to determine in your code whether you want the second one to be ignored or overwrite the first.) Once this new hash is finished, pull its values into a new array, and the "duplicates" will be gone.

Aaron B.
Available for small or large Perl jobs and *nix system administration; see my home node.
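
For anyone who wants to see the idea from the reply above in code, here is a minimal sketch. It assumes that manufacturer, model, and year together identify a car (that choice, the `@key_fields` name, and the `_` separator are illustrative assumptions, not something the original poster confirmed). It is also a slight variant of what the reply describes: the hash only records which keys have been seen, and the surviving entries are pushed onto a new array so the original order is preserved and the first entry for each key wins.

```perl
use strict;
use warnings;

# Sample data copied from the question (key spelling kept as posted).
my $data = {
    vehicalData => [
        { manufacturer => 'Kia',  model => 'Rio',   year => '2001', mileage => '130256', color => 'Metallic Blue' },
        { manufacturer => 'Kia',  model => 'Rio',   year => '2001', mileage => '130242', color => 'Lt Blue' },
        { manufacturer => 'Ford', model => 'Focus', year => '2003', mileage => '100684', color => 'Red' },
    ],
};

# Assumption: these three fields decide whether two entries are "the same".
my @key_fields = qw(manufacturer model year);

my %seen;      # composite key => number of times encountered
my @unique;    # entries that survive, in original order

for my $car (@{ $data->{vehicalData} }) {
    # Build the composite key, e.g. "Kia_Rio_2001".
    my $key = join '_', map { $car->{$_} } @key_fields;

    # Keep only the first entry for each key; later ones are skipped.
    next if $seen{$key}++;
    push @unique, $car;
}

# Replace the original list with the de-duplicated one.
$data->{vehicalData} = \@unique;

printf "%s %s %s (%s, %s miles)\n", @{$_}{qw(year manufacturer model color mileage)}
    for @{ $data->{vehicalData} };
```

With the three sample entries this leaves two: the Metallic Blue Kia Rio and the Ford Focus. To let the later entry overwrite the earlier one instead, store the entry itself (`$seen{$key} = $car`) and take `values %seen` at the end, keeping in mind that a hash does not preserve the original order. If any field could itself contain an underscore, pick a separator that cannot appear in the data.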