
Assigning Variables to String Elements

by ccelt09 (Sexton)
on Dec 06, 2013 at 23:59 UTC ( #1066083=perlquestion )

ccelt09 has asked for the wisdom of the Perl Monks concerning the following question:

Edits have been made to A Question with Nesting Arrays, and I can now explain more clearly. I did indeed mean to restructure my input data by sampling with replacement, not shuffling. Each element of the input array is a string that itself consists of four tab-delimited elements (shown below):

A1 B1 C1 D1
A2 B2 C2 D2
A3 B3 C3 D3

After restructuring my data I further subdivide the new array @tmp with the splice function. In the last for loop, with looping variable k, I wish to assign each of the four elements in a string to four separate variables. The difficulties I am encountering: the code below assigns the length of the string, 4, to each of my variables, and the for loop does not loop. For example, I wish the variable pi to equal A1, but it is assigned the value 4. I don't understand why the split function isn't working or why the last for loop doesn't loop/increase k.

my $runs = 1;           # for testing code
#my $runs = 1000;       # 1_000 - the num of times we repeat
#my $runs = 100000;     # 10 _000 - the num of times we repeat
#my $runs = 1000000;    # 100_000 - the num of times we repeat

# Program vars
my $i;                  # a looping variable
my $j;                  # another looping variable
my $k;                  # another looping var
my $range = 1552;       # total number of array elements
my @tmp;                # empty array to push data into

#Looping Variables
my $pi;
my $pi_sum;
my $L;
my $L_sum;
my $differences;
my $differences_sum;
my $coverage;
my $coverage_sum;
;
;

my $chr_X_input = "bootstrap_data.txt";
open (CHR_X_INPUT, "<$chr_X_input") or die "can't open chromosome X input";
my @X_info = <CHR_X_INPUT>;

# Outer loop: Repeat "$runs" times
for ($j = 0; $j < $runs; $j++) {
    for ($i = 0; $i < $range ; $i++) {
        # choose a randomly selected string of 4 elements from our array
        push (@tmp, $X_info[int(rand($range))]);
    }
    my @PAR1 = splice(@tmp, 0, 26,);
    for ($k = 0; $k < length(@PAR1) + 1 ; $k++) {
        my @PAR1_info = $PAR1[$k];
        $pi = split('\t', $PAR1_info[0]);
        $pi_sum = $pi_sum + $pi;
        $L = split('\t', $PAR1_info[1]);
        $L_sum = $L_sum + $L;
        $differences = split('\t', $PAR1_info[2]);
        $differences_sum = $differences_sum + $differences;
        $coverage = split('\t', $PAR1_info[3]);
        $coverage_sum = $coverage_sum + $coverage;
        my $PAR1_diversity = (($pi_sum/$L_sum)/($differences_sum/$coverage_sum));
    }
}

Replies are listed 'Best First'.
Re: Assigning Variables to String Elements
by jethro (Monsignor) on Dec 07, 2013 at 02:35 UTC

    Your @tmp-constructing loop is inside the $runs loop, and @tmp is never reset. So you add 1552 elements to @tmp, then splice the first 26 elements off. On the next run of the outer loop you add another 1552 elements to @tmp and again splice only 26 elements off. @tmp will grow fast, and it seems you are using only about 1/60th of it.

    The script would work exactly like before, but faster, if you changed this to

    for ($i = 0; $i < 26 ; $i++) {
        # choose a randomly selected string of 4 elements from our array
        push (@PAR1, $X_info[int(rand($range))]);
    }
    for ...
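
To make the growth described above concrete, here is a minimal sketch. It is not the original script: the dummy contents of @X_info, the names $take and @leftover, and the choice of three runs are assumptions made for illustration. It records how many elements remain in @tmp after each pass of the outer loop when @tmp is declared outside that loop.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the real data: 1552 dummy lines.
my $range  = 1552;
my $take   = 26;
my @X_info = map { "line$_" } 1 .. $range;

my @tmp;         # declared outside the loop, as in the original post
my @leftover;    # size of @tmp after each run, kept for inspection

for my $run (1 .. 3) {
    # Sample with replacement: $range random picks per run.
    push @tmp, $X_info[ int(rand($range)) ] for 1 .. $range;

    # Only the first $take elements are ever used.
    my @PAR1 = splice(@tmp, 0, $take);

    push @leftover, scalar(@tmp);
}

print "@leftover\n";    # prints "1526 3052 4578"
```

Moving the declaration of @tmp inside the outer loop, or pushing straight onto @PAR1 as suggested above, keeps the leftover count from accumulating.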
Re: Assigning Variables to String Elements
by GrandFather (Sage) on Dec 07, 2013 at 10:31 UTC

    There are two things that make it hard to provide a good answer for you. First, you seem to be writing C in Perl. That we can cope with and we can help you take advantage of Perl to clean up your code, but the bad side effect is that declaring all your variables up front makes it very hard to know what their true scope is.

    The second and more significant problem is that we don't actually know what you are trying to do. You don't give us any sample data. In fact you seem to be going out of your way to obscure what your data actually looks like. I'm fairly sure your real data isn't "characters", but is in fact "numbers", because your manipulations just don't make sense otherwise. It would help us help you if you gave some sensible sample data and some sensible-looking expected output.

    You might like to use the following Perlish script based on what your sample code seems to be doing and, if it doesn't solve your problems, revise it to demonstrate where your problems are.

    use strict;
    use warnings;

    my $runs = 3;
    my $range = 3;
    my @X_info = ("1\t2\t3\t4", "1\t3\t6\t8", "1\t4\t9\t16", "1\t5\t12\t32");

    # Outer loop: Repeat "$runs" times
    for my $run (1 .. $runs) {
        my @tmp;
        my $pi_sum;
        my $L_sum;
        my $differences_sum;
        my $coverage_sum;

        push @tmp, $X_info[rand @X_info] for 1 .. $range;

        my @PAR1 = splice @tmp, 0, 3,;

        for my $k (0 .. @PAR1 - 1) {
            my @PAR1_info = $PAR1[$k];
            my @values = split '\t', $PAR1_info[0];

            $pi_sum += $values[0];
            $L_sum += $values[1];
            $differences_sum += $values[2];
            $coverage_sum += $values[3];
        }

        my $PAR1_diversity = (($pi_sum / $L_sum) / ($differences_sum / $coverage_sum));

        printf "%2d: %6.3f\n", $run, $PAR1_diversity;
    }

    Prints:

     1:  0.519
     2:  0.630
     3:  0.519
    True laziness is hard work
Re: Assigning Variables to String Elements
by Kenosis (Priest) on Dec 07, 2013 at 07:52 UTC

    ig effectively addressed your split issue.

    I have a question about pushing 1552 randomly-obtained strings from @X_info onto @tmp, and then doing a splice to move the first 26 from @tmp into @PAR1. (Although jethro correctly pointed out that @tmp is never reset, so strings keep getting pushed onto it. Nevertheless, the randomness doesn't essentially change.) Why not just generate those first 26 randomly-obtained strings, one at a time, and split them as you go? In both cases--1552 using the first 26 vs. just generating 26--the 26 strings were randomly obtained from a superset. Is there a certain protocol that you're following that requires you to generate all 1552 and then grab the first 26 for processing? If not, then consider the following refactoring:

    use warnings;
    use strict;

    my $runs = 1;    # for testing code

    # Program vars
    my $chr_X_input = "bootstrap_data.txt";
    my $range = 1552;    # total number of array elements
    my $pi_sum;
    my $L_sum;
    my $differences_sum;
    my $coverage_sum;

    open my $CHR_X_INPUT, '<', $chr_X_input or die "Can't open chromosome X input: $!";
    chomp( my @X_info = <$CHR_X_INPUT> );
    close $CHR_X_INPUT;

    for ( 1 .. $runs ) {
        for ( 1 .. 26 ) {
            my ( $pi, $L, $differences, $coverage ) = split /\t/, $X_info[ int( rand($range) ) ];
            $pi_sum += $pi;
            $L_sum += $L;
            $differences_sum += $differences;
            $coverage_sum += $coverage;
            my $PAR1_diversity = ( ( $pi_sum / $L_sum ) / ( $differences_sum / $coverage_sum ) );
        }
    }

    Hope this helps!

Re: Assigning Variables to String Elements
by ig (Vicar) on Dec 07, 2013 at 00:47 UTC

    length probably isn't doing what you think it does. Try: for($k = 0; $k < @PAR1; $k++).

    Or, even better:

    for my @PAR1_info (@PAR1) {
        $pi = split('\t', $PAR1_info[0]);
        $pi_sum = $pi_sum + $pi;
        $L = split('\t', $PAR1_info[1]);
        $L_sum = $L_sum + $L;
        $differences = split('\t', $PAR1_info[2]);
        $differences_sum = $differences_sum + $differences;
        $coverage = split('\t', $PAR1_info[3]);
        $coverage_sum = $coverage_sum + $coverage;
        my $PAR1_diversity = (($pi_sum/$L_sum)/($differences_sum/$coverage_sum));
    }

    Edit: As AnomalousMonk pointed out, the above code doesn't compile. That's what I get for neither thinking nor testing. What I thought I was thinking was more like the following, which does compile (after a bit of testing, fixing bugs and making presumptive changes):

    $pi_sum = 0;
    $L_sum = 0;
    $differences_sum = 0;
    $coverage_sum = 0;
    for my $PAR1 (@PAR1) {
        my @PAR1_info = split(/\t/, $PAR1);
        $pi = $PAR1_info[0];
        $pi_sum = $pi_sum + $pi;
        $L = $PAR1_info[1];
        $L_sum = $L_sum + $L;
        $differences = $PAR1_info[2];
        $differences_sum = $differences_sum + $differences;
        $coverage = $PAR1_info[3];
        $coverage_sum = $coverage_sum + $coverage;
        my $PAR1_diversity = (($pi_sum/$L_sum)/($differences_sum/$coverage_sum));
    }

    Which could be reduced to:

    $pi_sum = 0;
    $L_sum = 0;
    $differences_sum = 0;
    $coverage_sum = 0;
    for my $PAR1 (@PAR1) {
        my ($pi, $L, $differences, $coverage) = split(/\t/, $PAR1);
        $pi_sum = $pi_sum + $pi;
        $L_sum = $L_sum + $L;
        $differences_sum = $differences_sum + $differences;
        $coverage_sum = $coverage_sum + $coverage;
        my $PAR1_diversity = (($pi_sum/$L_sum)/($differences_sum/$coverage_sum));
    }

    And, prior to this, chomping the input is probably appropriate:

    my @X_info = <CHR_X_INPUT>;
    chomp(@X_info);
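
As a small illustration of why chomping is suggested here, the sketch below splits one made-up record (in the A1/B1/C1/D1 shape from the question, not real data) with and without chomp. Without it, the newline read from the file stays attached to the last field, which can surprise you when that field is printed or compared as a string.

```perl
use strict;
use warnings;

# Made-up record in the shape described in the question.
my $line = "A1\tB1\tC1\tD1\n";

# Split without chomping: the newline stays on the last field.
my @raw = split(/\t/, $line);

# Chomp a copy first, then split: the last field is clean.
chomp(my $clean = $line);
my @fields = split(/\t/, $clean);

printf "raw last field: %d chars, chomped last field: %d chars\n",
    length($raw[-1]), length($fields[-1]);
# prints "raw last field: 3 chars, chomped last field: 2 chars"
```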
      for my @PAR1_info (@PAR1) {
          ...
      }

      This does not compile:  Missing $ on loop variable at ...

      Thank you, length most certainly was not doing what I thought it was doing! Now I can only imagine I am not splitting the @PAR1_info array properly.
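
For anyone reading along, a minimal sketch of what length was doing here, using 26 dummy elements (the contents of @PAR1 and the variable names $count and $len are made up for illustration). length expects a string, so an array passed to it is evaluated in scalar context, giving the element count; the result is then the length of that number written as a string. The two steps are spelled out separately below instead of calling length(@PAR1) directly.

```perl
use strict;
use warnings;

my @PAR1 = map { "row$_" } 1 .. 26;    # 26 dummy elements

# The two steps that length(@PAR1) effectively performs:
my $count = @PAR1;             # array in scalar context: 26
my $len   = length($count);    # length of the string "26": 2

print "count: $count, length: $len\n";    # prints "count: 26, length: 2"

# So "$k < length(@PAR1) + 1" compares against 3, while
# "$k < @PAR1" compares against the real element count.
my $iterations = 0;
for (my $k = 0; $k < @PAR1; $k++) {
    $iterations++;
}
print "iterations: $iterations\n";        # prints "iterations: 26"
```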

        $pi = split('\t', $PAR1_info[0]);

        Split returns an array. When you assign an array to a scalar, the array is evaluated in a scalar context and an array evaluated in a scalar context gives the number of elements in the array, not the first element of the array.

        There are, as usual with Perl, several ways to get the first element. Here are a couple:

        my ($pi) = split(/\t/, $PAR1_info[0]);

        my $pi = (split(/\t/, $PAR1_info[0]))[0];

        Note also that split takes a regular expression (pattern) as its first argument, not a string.

        Edit: Some days I should stay away from my keyboard....

        As dave_the_m points out, what I said about split returning an array is incorrect. In fact (as is generally the case with functions) what split does depends on the context in which it is evaluated. Split does various things differently when evaluated in scalar context rather than list or void context. Of relevance here, from split:

        Splits the string EXPR into a list of strings and returns the list in list context, or the size of the list in scalar context.

        And, while split is documented to take a pattern, that pattern can be a string.
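
The context behaviour described above can be sketched as follows, using a made-up record in the shape from the question (the variable names $n, $first and $second are illustrative only). The same split call yields a count when assigned to a plain scalar and yields the fields when assigned to a list.

```perl
use strict;
use warnings;

my $line = "A1\tB1\tC1\tD1";    # made-up record

# Scalar context: split returns the number of fields.
my $n = split(/\t/, $line);

# List context (note the parentheses): first field only.
my ($first) = split(/\t/, $line);

# List slice: pick one field by index.
my $second = (split(/\t/, $line))[1];

# List context: all four fields at once.
my ($pi, $L, $differences, $coverage) = split(/\t/, $line);

print "$n $first $second $coverage\n";    # prints "4 A1 B1 D1"
```

The scalar-context assignment is the form used in the original question, which is why every variable there ended up holding a count instead of a field.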

Re: Assigning Variables to String Elements
by toolic (Bishop) on Dec 07, 2013 at 00:15 UTC

      I have both strict and warnings in my original code; they were omitted to save space. I have also printed the variables, again omitted for space. This is how I know the split function is not working and that the variables are all being assigned the string length of 4 instead of the elements I wish to assign them.

Node Type: perlquestion [id://1066083]
Approved by toolic