PerlMonks  

Re^3: sorting text into sentences.

by davidj (Priest)
on Sep 06, 2004 at 08:00 UTC ( [id://388745] )


in reply to Re^2: sorting text into sentences.
in thread sorting text into sentences.

This will do it for you.

Sample text file:
C:\Temp>type t.txt
This is a test sentence. And this is another one; Also so is this. And by the way, this is the fourth sentence.
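Now the Perl code:
#!/usr/bin/perl
use strict;
use warnings;

my $s = '';
my @arr;

# Read the whole file into one string, joining lines with a space
# so that words at line breaks do not run together.
open(my $fh, '<', 't.txt') or die "Can't open t.txt: $!";
while (<$fh>) {
    chomp;
    $s .= "$_ ";
}
close($fh);

# A sentence starts with a capital letter and runs to the next '.' or ';'.
@arr = $s =~ m/[A-Z].+?[.;]/g;

foreach (@arr) {
    print $_, "\n";
}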
Now the output:
C:\Temp>t.pl
This is a test sentence.
And this is another one;
Also so is this.
And by the way, this is the fourth sentence.
As you can see, each element of @arr contains one sentence (as you have defined it).

Hope this helps,
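For reuse, the same match can be wrapped in a small subroutine. This is only a sketch: the name split_sentences is hypothetical (it is not in the original post), and the /s flag is an addition so that a sentence may span an embedded newline. It keeps the thread's definition of a sentence, so abbreviations such as "e.g." or "Dr." will end a sentence early, and a sentence that does not begin with a capital letter is only picked up from its first capital.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns every
# substring that starts with a capital letter and ends at the next
# '.' or ';'. The /s flag lets '.' in the pattern cross newlines.
sub split_sentences {
    my ($text) = @_;
    my @sentences = $text =~ m/[A-Z].+?[.;]/gs;
    return @sentences;
}

my $text = "This is a test sentence. And this is another one; "
         . "Also so is this. And by the way, this is the fourth sentence.";

my @arr = split_sentences($text);
print "$_\n" for @arr;
```

In list context the /g match returns all matches at once, which is why no explicit loop is needed to fill the array.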

davidj

Replies are listed 'Best First'.
Re^4: sorting text into sentences.
by chiburashka (Initiate) on Sep 06, 2004 at 08:12 UTC
    Thanks a lot, but I had already written some code:
    #!/usr/bin/perl -w
    use Strict;

    $dat = "a.txt";
    open(DAT, "$dat") || die "Can't open the file.\n";
    @a = <DAT>;
    close(DAT);

    my $temp3;
    foreach (@a) { chomp $_; $temp3 .= "$_ " }
    @a = split(/.,;/, $temp3);
    foreach (@a) { $_ .= "\n"; }
    print @a;
    "If you know the right question to ask, you already know the answer."
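      From your description, you want split(/[.;]/, ...), not split(/.,;/, ...). The pattern /.,;/ matches any single character followed by a literal comma and a literal semicolon, so it will almost never split anything. I also don't see where capital letters are checked.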
        Never mind, thanks, that'll do :)

        P.S. I remembered that I don't check the dots and semicolons themselves, so split(/.,;/, $watever) will do just fine.
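The difference between the two split patterns discussed above is easy to check on the sample text. The sketch below is illustrative only. The variable names are made up, and the third pattern (a lookbehind that keeps the punctuation) is an addition not proposed by anyone in the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "This is a test sentence. And this is another one; Also so is this.";

# /.,;/ means: any one character, then a literal ',', then a literal ';'.
# That three-character sequence never occurs here, so nothing is split.
my @unsplit = split(/.,;/, $text);

# /[.;]/ is a character class: split at every '.' or ';'.
# The delimiters themselves are dropped from the pieces.
my @pieces = split(/[.;]/, $text);

# Lookbehind variant (an assumption, not from the thread): split on the
# whitespace that follows a '.' or ';', so each piece keeps its punctuation.
my @kept = split(/(?<=[.;])\s+/, $text);

print scalar(@unsplit), " piece(s) with /.,;/\n";
print scalar(@pieces),  " piece(s) with /[.;]/\n";
print "[$_]\n" for @kept;
```

Note that the pieces from /[.;]/ still carry the leading space that followed each delimiter; trimming them is left out to keep the sketch short.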

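      By the way, the proper invocation is "use strict;". On case-insensitive file systems, "use Strict;" will unfortunately not complain, but won't do anything either: Perl finds strict.pm, then looks for an import method in the package Strict, which does not exist, and a missing import is silently skipped. On a case-sensitive file system the same line fails because Strict.pm cannot be found.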
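To see what is lost when strict is not actually in effect, here is a small sketch that compiles the same assignment with and without it. It does not reproduce the "use Strict;" case itself, since that depends on the file system. The string evals and the variable name $undeclared are purely illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# With strict in effect, assigning to an undeclared variable is a
# compile-time error; the string eval traps it so we can inspect $@.
my $with_strict  = eval q{ use strict; $undeclared = 1; 1 };
my $strict_error = $@;
chomp $strict_error;

# With strict switched off, the same assignment quietly creates a
# package variable and the eval succeeds.
my $without_strict = eval q{ no strict; $undeclared = 1; 1 };

print "with strict:    ", ($with_strict    ? "compiled" : "rejected ($strict_error)"), "\n";
print "without strict: ", ($without_strict ? "compiled" : "rejected"), "\n";
```

The script posted above assigns to $dat and @a without declaring them, which is exactly the kind of thing a working "use strict;" would have flagged.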
