Thanks for your effort in understanding my problem here. I tried your code, and it works perfectly for the example I had posted. I didn't understand this part:
foreach (@aListMembers) {
    # we only care if it's longer than what we already have
    if (exists($hLongest{$_})) {
        my $kLists = $hLongest{$_};
        my $aList = $hLists{$kLists};
        next if ($iCount <= scalar(@$aList));
    }
%hLongest is empty at this point, so how can exists() find anything here?
Moreover, when I added a line to the input file, like this:
mylist_12 sublist153 sublist_34 sublist_123 sublist_345 sublist_245
mylist_1 sublist_153 sublist_87 sublist_876 sublist_78
mylist_6 sublist_8
mylist_2 sublist_12 sublist_34 sublist_09
mylist_3 sublist_87 sublist_09
mylist_7 sublist_8 sublist_9
mylist_9 sublist_56
the result should be:
mylist_12 sublist_153 sublist_34 sublist_123 sublist_345 sublist_245
mylist_2 sublist_12 sublist_34 sublist_09
mylist_7 sublist_8 sublist_9
mylist_9 sublist_56
But in the actual result, even the shorter line that contains sublist_153 gets added, like this:
mylist_12 sublist153 sublist_34 sublist_123 sublist_345 sublist_245
mylist_1 sublist_153 sublist_87 sublist_876 sublist_78
mylist_2 sublist_12 sublist_34 sublist_09
mylist_7 sublist_8 sublist_9
mylist_9 sublist_56
In the above result, sublist_153 is present in 2 lines.
In my final output, all the lines should be unique: no two lines in the output file should have anything in common.
In your program, are you comparing each element in each line with each element in every other line? Could we instead sort the lines by length in descending order first, and then compare "one line" against all the other lines in the file? Whenever that "one line" has elements in common with some other line, the other line can be deleted. We need not worry about length, because the "one line" will always be at least as long as the lines after it, since the file is sorted by length in descending order.

So, as we read through the whole file, the "one line" is the current line inside a foreach or while loop, and we only ever encounter the lines that are left, because the lines with duplicate elements were deleted when we found them.

I hope my explanation is clear. Thank you once again for your kind help :)
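To make the idea concrete, here is a rough sketch of what I have in mind. It is only my guess at an approach, not a change to your code: the sub name filter_disjoint and the sample data (list_a, x1, and so on) are made up for illustration. It sorts the lines by number of fields, longest first, and keeps a line only if none of its elements has already appeared in a kept line. Remembering the seen elements in a hash has the same effect as deleting the later lines that share an element.

```perl
use strict;
use warnings;

# Hypothetical helper: takes lines of the form "name elem elem ..."
# and returns the lines that are kept, longest first.
sub filter_disjoint {
    my @lines = @_;

    # Split each non-blank line into fields; the first field is the list name.
    my @rows = map { [ split ' ', $_ ] } grep { /\S/ } @lines;

    # Sort row indices by field count, descending. Ties keep input order.
    my @order = sort {
        scalar(@{ $rows[$b] }) <=> scalar(@{ $rows[$a] }) || $a <=> $b
    } 0 .. $#rows;

    my (%seen, @kept);
    for my $i (@order) {
        my ($name, @members) = @{ $rows[$i] };

        # Skip this line if it shares any element with a line already kept.
        next if grep { $seen{$_} } @members;

        $seen{$_} = 1 for @members;
        push @kept, join(' ', $name, @members);
    }
    return @kept;
}

# Made-up sample data, not the lists from my input file.
my @input = (
    'list_a x1 x2 x3',
    'list_b x3 x4',
    'list_c x5',
    'list_d x4 x5 x6 x7',
);
print "$_\n" for filter_disjoint(@input);
```

With this sample, list_d is the longest and is kept first, list_a shares nothing with it and is kept too, while list_b (x3) and list_c (x5) are dropped. The output comes out in longest-first order, not in the original file order.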