split() Problem

Omukawa has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: split() Problem by Joost (Canon) on Jul 21, 2008 at 22:01 UTC
You're not asking anything, and there are so many red flags and probable mistakes here that I don't know what the hell this code is supposed to do. Just the first line, then: `for $_ (%edict) { # for entries with more than one keyword` That iterates over the hash keys AND values, storing each of them in `$_` in turn. In other words, given the hash `%edict = ( a => 1, b => 2);`, `$_` would contain 'a', then 1, then 'b' and then 2. Also, you're splitting whatever it is you're splitting on "; " (that's a semicolon followed by a space), instead of on a semicolon as you claim you want. Update: as a first attempt at clarifying this, please explain what the top-level for loop is supposed to do and how that loop's body is affected by the loop (and especially, what you're doing with `$_`).
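To make the point about looping over a hash concrete, here is a minimal sketch. The `%edict` contents are the two-pair example from the reply, not the poster's real data, and `@flattened` and `@only_keys` are names made up for illustration.

```perl
use strict;
use warnings;

# Stand-in data taken from the example in the reply above.
my %edict = ( a => 1, b => 2 );

# A hash in list context flattens to key, value, key, value, ...
# so looping over %edict directly visits 4 items, not 2.
my @flattened = %edict;

# keys() yields only the keys; hash order is not guaranteed,
# so sort them when a stable order matters.
my @only_keys = sort keys %edict;

print scalar(@flattened), " items when looping over the hash directly\n";
for my $key (@only_keys) {
    print "$key => $edict{$key}\n";
}
```

If the loop is meant to visit one entry per iteration, `for my $key (keys %edict)` is probably closer to the intent than `for $_ (%edict)`.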
Re: split() Problem by moritz (Cardinal) on Jul 21, 2008 at 21:56 UTC
Quoting the question: "The following part of the script should look if there two words seperated by ";", split them, And then" `@foo = split (/; /, $edict{$count}{english});` Note the extra space in the regex: the code doesn't do what you described. If no space follows the semicolon, the pattern will not match and the string will not be split. Anyway, your example uses several nested data structures about which we know nothing. We also don't know what the string you want to split looks like. If you want more or better help, give us some data (and a simpler piece of code that does the same thing). See also How (Not) To Ask A Question.
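A small sketch of how the space in the pattern changes the result. The sample strings are invented, since the real dictionary data was not posted, and `/;\s*/` is only one possible way to make the space optional.

```perl
use strict;
use warnings;

# Made-up sample strings; the real data may look different.
my $with_space    = "office; court";
my $without_space = "office;court";

my @a = split /; /,   $with_space;     # semicolon + space is present: two pieces
my @b = split /; /,   $without_space;  # pattern never matches: one piece, unsplit
my @c = split /;\s*/, $without_space;  # optional whitespace: two pieces
my @d = split /;\s*/, $with_space;     # optional whitespace: two pieces

print scalar(@a), " ", scalar(@b), " ", scalar(@c), " ", scalar(@d), "\n";
```

When the pattern does not match, `split` returns the whole string as a single element instead of an empty list.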
Re: split() Problem by CountZero (Bishop) on Jul 21, 2008 at 22:34 UTC
It looks as if `$edict{$count}` uses a counter as its key. Perhaps it would be more logical to use an array rather than a hash here? When you do `for $_ (@foo)`, `$_` will contain each element of the array `@foo` in turn. Next you do `$foo[$_]`; in other words, you index into the array with the value of the element. This is almost certainly wrong, unless you are trying to implement some sort of linked list. I think you just want to use the content of each element, and that is already in `$_`. CountZero "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
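The difference between looping over values and looping over indexes can be sketched like this. The contents of `@foo` are illustrative only, and `@by_value` and `@by_index` are hypothetical names.

```perl
use strict;
use warnings;

my @foo = ("office", "court");   # illustrative values only

# Looping over the array: $_ holds each element itself, not its position.
my @by_value;
for (@foo) {
    push @by_value, $_;
}

# If the position is really needed, loop over the indexes instead.
my @by_index;
for my $i (0 .. $#foo) {
    push @by_index, "$i: $foo[$i]";
}

print "@by_value\n";
print join(", ", @by_index), "\n";
```

Writing `$foo[$_]` inside the first loop would use a word such as "office" as an array index, which is not what the loop is trying to do.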
Re: split() Problem by Omukawa (Acolyte) on Jul 22, 2008 at 09:51 UTC
Sorry for the obscurities. As I said, I use hashes of hashes to store dictionary entries. So `$edict{0}` contains the first entry (0 is the key to the embedded hash) and `$edict{0}{english}` contains the English translation. That's mostly one word like "house" etc., but sometimes there are entries like "office; court; ....". That's why the split looks for "; ", since there is always a whitespace after the semicolon. What I'm trying to do is to split an entry with more than one English word, put the words in the array `@foo`, then take the first word from `@foo` and put it in another array `@engdict` together with the other information that belongs to it, so that I will have two separate scalars in `@engdict`: one is office : xyz, and the other is court : xyz. So that's what the code must do:
1. Find the entry about the English translation with more than one word inside, separated by "; "
2. Put them separately in the array `@foo`
3. Take the first word from `@foo` and put it in `$engdict[$no]` with the other information
4. Add one to `$no`
5. Take the second word and put it in `$engdict[$no+1]`, and so on.
The problem is that this is not working as I thought. The `$foo[$_]` is always empty, so I think I'm doing something wrong with the split() command, but I can't see the mistake. So here is my question: Can you tell me the mistake here? Thank you
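The steps listed above could be sketched roughly as follows. This is not the poster's code, which does not appear in the thread: only the `english` field is named in the post, so the `other` field and all sample values are assumptions standing in for "the other information".

```perl
use strict;
use warnings;

# Hypothetical data: {english} comes from the post, {other} is a placeholder.
my %edict = (
    0 => { english => "house",         other => "abc" },
    1 => { english => "office; court", other => "xyz" },
);

my @engdict;
for my $count (sort { $a <=> $b } keys %edict) {
    my $entry = $edict{$count};

    # split returns a one-element list when the string contains no "; ",
    # so single-word entries need no special case.
    for my $word (split /; /, $entry->{english}) {
        # push appends at the end, replacing the manual $no counter.
        push @engdict, "$word : $entry->{other}";
    }
}

print "$_\n" for @engdict;
```

Using `push` avoids tracking `$no` by hand, which removes one place where an off-by-one error could slip in.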
Re^2: split() Problem by moritz (Cardinal) on Jul 22, 2008 at 10:08 UTC
I think your problem is that you do too much at once without testing the results of the intermediate steps. Most of my scripts start with `use strict; use warnings; use Data::Dumper;` (always use those). When you suspect that split isn't working the way you think it is, try adding the line `print Dumper \@foo;` immediately after the split, and check whether the results are what you expected. Regarding "The `$foo[$_]` is always empty": `for` iterates over the values in the array, not the indexes. So instead of `$foo[$_]` just write `$_`.
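A minimal sketch of that debugging step, assuming a made-up sample string in place of `$edict{$count}{english}`:

```perl
use strict;
use warnings;
use Data::Dumper;

my $english = "office; court";   # invented sample value
my @foo = split /; /, $english;

# Inspect the intermediate result before doing anything else with it.
print Dumper \@foo;

# Then use each value directly through $_.
for (@foo) {
    print "word: $_\n";
}
```

If the dump shows the expected words, the split is fine and the problem lies in the code that follows it.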
Re^3: split() Problem by Omukawa (Acolyte) on Jul 22, 2008 at 10:52 UTC
Replacing `$foo[$_]` with `$_` solved the problem. Thanks a lot. Also, thank you for the advice about Data::Dumper.

