PerlMonks  

Setting $0 clears /proc/PID/environ too

by kikuchiyo (Hermit)
on Jan 16, 2020 at 17:52 UTC ( [id://11111495]=perlquestion )

kikuchiyo has asked for the wisdom of the Perl Monks concerning the following question:

As the title says: on Linux, if you assign anything to $0 (with the intent to change the program's name as displayed by ps et al.), not only are the program's name and arguments changed, but the environment (as shown in /proc/PID/environ) is cleared as well, or more precisely, filled with spaces.

The perlvar entry for $0 contains a paragraph that vaguely alludes to this:

            In some platforms there may be arbitrary amount of padding, for
            example space characters, after the modified name as shown by
            "ps". In some platforms this padding may extend all the way to
            the original length of the argument area, no matter what you do
            (this is the case for example with Linux 2.2).

So I kind of understand what happens here and why, I just find it rude.

It is somewhat more concerning that the effect persists even if you localize $0.

    #!/usr/bin/perl
    sleep 10;
    {
        local $0 = 'changed';
        sleep 10;
    }
    sleep 10;

When you run the program above and watch a process list in a different terminal, you can observe that the apparent process name changes to 'changed' after 10 seconds, then changes back again. But if you watch the contents of /proc/PID/environ at the same time, you can see that it gets filled with spaces and then doesn't change back.
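
The same observation can be made from inside a single process, without a second terminal. What follows is only a minimal sketch, assuming Linux with /proc mounted. The helper names slurp_proc and describe are made up for this example, and whether the environment area really gets overwritten depends on the Perl build and the kernel, so the script just reports what it sees.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the raw bytes of /proc/self/<name>,
# or undef if the file cannot be read (e.g. not on Linux).
sub slurp_proc {
    my ($name) = @_;
    open my $fh, '<:raw', "/proc/self/$name" or return;
    local $/;                      # slurp mode
    my $data = <$fh>;
    close $fh;
    return $data;
}

# Hypothetical helper: one-line summary that makes NULs and the
# number of spaces visible.
sub describe {
    my ($data) = @_;
    return '(unreadable)' unless defined $data;
    my $spaces = () = $data =~ / /g;
    (my $shown = substr($data, 0, 60)) =~ s/\0/\\0/g;
    return sprintf '%d bytes, %d spaces, starts with "%s"',
        length $data, $spaces, $shown;
}

my $original_name = $0;
my %seen;

$seen{before} = slurp_proc('environ');
{
    local $0 = 'changed';
    $seen{during} = slurp_proc('environ');
}
$seen{after} = slurp_proc('environ');

print "environ $_: ", describe($seen{$_}), "\n" for qw(before during after);
print "\$0 is now: $0\n";
```

If the effect described above is present on your system, the "during" and "after" lines should show a much higher space count than the "before" line, even though the Perl-level value of $0 has been restored.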

Two additional things to note:

  • The fake process name (as assigned to $0) spills over the memory that formerly contained the environment, so if you do something like $0 = 'changed' x 10000, /proc/PID/environ will contain something like "angedchangedchanged...changed\x00 ...".
  • When the localized $0 goes out of scope, Perl only makes a weak attempt to change the process name back to its original. So if you originally ran the program with arguments, the output of ps contained something like "perl foo.pl -a -b -c", but after restoration it will just be "foo.pl".

I think that an argument could be made that Perl should try to preserve the contents of /proc/PID/environ when changing $0, and do a more thorough job of restoring the original command line if $0 is localized.

Replies are listed 'Best First'.
Re: Setting $0 clears /proc/PID/environ too
by jcb (Parson) on Jan 16, 2020 at 23:10 UTC

    The problem is that the process name and /proc/PID/environ are actually windows into the address space of the process. The pointers that determine these windows are in the kernel and the program cannot update them, but you can change the data stored in the region they point to.
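
On Linux the boundaries of these windows can be looked at directly: proc(5) documents fields 48 to 51 of /proc/PID/stat as arg_start, arg_end, env_start and env_end. The following is a rough sketch that assumes those fields are present and readable on your kernel; the function name region_bounds is invented for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the argument and environment region
# boundaries of a process as a hash ref, or undef if unavailable.
sub region_bounds {
    my ($pid) = @_;
    open my $fh, '<', "/proc/$pid/stat" or return;
    my $stat = do { local $/; <$fh> };
    close $fh;
    return unless defined $stat;

    # Field 2 (comm) is parenthesised and may itself contain spaces or
    # parentheses, so split only what follows the last ')'.
    my $pos = rindex $stat, ')';
    return if $pos < 0;
    my @rest = split ' ', substr($stat, $pos + 1);   # $rest[0] is field 3

    # Fields 48..51 therefore sit at indexes 45..48.
    return if @rest < 49;
    my %bounds;
    @bounds{qw(arg_start arg_end env_start env_end)} = @rest[45 .. 48];
    return \%bounds;
}

if (my $bounds = region_bounds($$)) {
    printf "%-9s 0x%x\n", $_, $bounds->{$_}
        for qw(arg_start arg_end env_start env_end);
} else {
    print "region boundaries not available on this system\n";
}
```

On a typical Linux process the two regions are expected to be adjacent (arg_end equal to env_start), which would explain why an overlong process name spills into what /proc/PID/environ shows.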

    Does changing $0 destroy %ENV or does perl copy the environment out of the way before reusing the original environment block?

      Does changing $0 destroy %ENV or does perl copy the environment out of the way before reusing the original environment block?

      No, that happens very early, so even BEGIN blocks have %ENV correctly filled, and in any case later changes to %ENV are not reflected in /proc/PID/environ. That file is more like a historical record of what the environment was when the process started, and as such, it has potential (if marginal) uses, and that's why this bothers me.
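
Both points can be checked with a short script. This is a sketch under the assumption of Linux with /proc mounted; the helper startup_env_names and the probe variable name are invented for the example and are not part of any module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: variable names found in /proc/self/environ,
# or an empty list if the file cannot be read.
sub startup_env_names {
    open my $fh, '<:raw', '/proc/self/environ' or return;
    my $raw = do { local $/; <$fh> };
    close $fh;
    return unless defined $raw;
    return map { /\A([^=]+)=/ ? $1 : () } split /\0/, $raw;
}

# A variable name that is assumed not to exist in the environment yet.
my $probe = "DEMO_PROBE_$$";

my %env_before = %ENV;              # snapshot of Perl's view at startup
$ENV{$probe} = 'set at run time';   # a later change to %ENV
my %in_proc = map { $_ => 1 } startup_env_names();

$0 = 'changed';                     # may overwrite the startup area

# Compare Perl's view after the assignment with the snapshot.
my @lost = grep { !exists $ENV{$_} || $ENV{$_} ne $env_before{$_} }
           keys %env_before;

printf "%%ENV entries damaged by setting \$0: %d\n", scalar @lost;
printf "%s visible in %%ENV: %s\n",
    $probe, exists $ENV{$probe} ? 'yes' : 'no';
printf "%s visible in /proc/self/environ: %s\n",
    $probe, $in_proc{$probe} ? 'yes' : 'no';
```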

Re: Setting $0 clears /proc/PID/environ too
by talexb (Chancellor) on Jan 17, 2020 at 02:29 UTC

    I'm really not sure why you're finding it necessary to change the value of $0 anyway -- to me, that's a read-only value passed into your script from the environment you're running in. Why change it? What problem are you trying to solve?

    Alex / talexb / Toronto

    Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.

      I'm really not sure why you're finding it necessary to change the value of $0

      Changing the process name is quite common in Unix. If you have access to a Linux system, run this script as root:

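          #!/usr/bin/perl
          use v5.12;
          use warnings;
          use autodie;
          use Data::Dumper;

          # Walk every numeric /proc entry (one per process).
          for my $dir (sort grep m|^/proc/\d+$|, glob '/proc/*') {
              say "$dir:";
              # cmdline is NUL-separated; the first record is argv[0].
              my $argv0 = do {
                  local $/ = "\0";
                  open my $f, '<', "$dir/cmdline";
                  scalar <$f>;
              };
              # exe is a symlink to the running binary.
              my $exe = eval { readlink("$dir/exe") };
              say Data::Dumper->new([$argv0, $exe], [qw(argv0 exe)])->Dump();
          }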

      On my server, output looks like this (many boring repeated cases removed):

          /proc/1:
          $argv0 = 'init [4]';
          $exe = '/sbin/init';

          /proc/10:
          $argv0 = undef;
          $exe = undef;

          /proc/1001:
          $argv0 = '/usr/sbin/ypbind';
          $exe = '/usr/sbin/ypbind';

          /proc/1048:
          $argv0 = '/usr/sbin/rpc.mountd';
          $exe = '/usr/sbin/rpc.mountd';

          /proc/105:
          $argv0 = undef;
          $exe = undef;

          /proc/1059:
          $argv0 = '/usr/sbin/acpid';
          $exe = '/usr/sbin/acpid';

          /proc/106:
          $argv0 = undef;
          $exe = undef;

          /proc/1069:
          $argv0 = '/usr/sbin/console-kit-daemon';
          $exe = '/usr/sbin/console-kit-daemon';

          /proc/107:
          $argv0 = undef;
          $exe = undef;

          /proc/1079:
          $argv0 = '/usr/sbin/crond';
          $exe = '/usr/sbin/crond';

          /proc/108:
          $argv0 = undef;
          $exe = undef;

          /proc/1084:
          $argv0 = '/usr/lib/polkit-1/polkitd';
          $exe = '/usr/lib/polkit-1/polkitd';

          /proc/13902:
          $argv0 = '-:0 ';
          $exe = '/usr/bin/xdm';

          /proc/16697:
          $argv0 = 'sshd: alex [priv]';
          $exe = '/usr/sbin/sshd';

          /proc/16700:
          $argv0 = 'sshd: alex@pts/0';
          $exe = '/usr/sbin/sshd';

          /proc/16701:
          $argv0 = '-bash';
          $exe = '/bin/bash';

          /proc/17046:
          $argv0 = '/usr/bin/perl';
          $exe = '/usr/bin/perl5.22.2';

          /proc/5156:
          $argv0 = '/bin/sh';
          $exe = '/bin/bash';

          /proc/5225:
          $argv0 = '/opt/exim/bin/exim';
          $exe = '/opt/exim/bin/exim-4.72-1';

          /proc/5298:
          $argv0 = '-:1 ';
          $exe = '/usr/bin/xdm';

          /proc/5562:
          $argv0 = 'postgres: checkpointer process ';
          $exe = '/opt/pg9/bin/postgres';

          /proc/5563:
          $argv0 = 'postgres: writer process ';
          $exe = '/opt/pg9/bin/postgres';

          /proc/5564:
          $argv0 = 'postgres: wal writer process ';
          $exe = '/opt/pg9/bin/postgres';

          /proc/5565:
          $argv0 = 'postgres: autovacuum launcher process ';
          $exe = '/opt/pg9/bin/postgres';

          /proc/5566:
          $argv0 = 'postgres: stats collector process ';
          $exe = '/opt/pg9/bin/postgres';

      Usually, you will find that argv0 is equal to the absolute path of the executable. When run manually, it may be just the base name, or the base name of a link to the executable. This happens in the list above for perl5.22.2 invoked as perl, for bash invoked as sh, and for exim-4.72-1 invoked as exim. The init process changes its argv0 to include the runlevel. Postgres fork()s some worker processes, all running from the same executable, and changes their names to indicate their jobs. sshd does something similar to indicate privileged and unprivileged processes and their respective users. xdm changes its name to indicate the X display for which it is responsible. This allows seeing the process state even in a very simple process list.

      There are a few processes with neither argv0 nor exe defined; they are created by the Linux kernel for its own purposes.

      See also readproctitle from djb's daemontools.

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

        OK -- changing the process name is fine, and I understand that can be quite handy. I was asking about changing $0, which to me is one of Perl's read-only variables. Then again, Perl's a language (like C) where you can do A Weird Thing and the language will shrug and say, OK, Joe! while thinking Hmm, not sure why you'd wanna do that. :) Thanks for the detailed response.

        Alex / talexb / Toronto

        Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.

      In this case it wasn't me who changed it, it was Plack. It does a local $0 = something as a step of a sandboxing scheme when it loads the application code. So I don't have any say in it.

      That said, there is a legitimate use case for changing the apparent process name. As perlvar suggests, it's more like a way for signaling state and displaying information than a way to hide identity.

      Consider a situation where a number of identical services are running on a host, each in a different container serving a different customer. One is misbehaving, e.g. leaking memory or eating CPU. If I change $0 in them to display the customer name or container id in the process name as a fake argument, the sysadmin logging onto the host can just glance at ps's output and immediately know which of the instances is at fault. I can even display more information in the process name, like the version number, number of active connections, unprocessed items in the queue etc.
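
A minimal sketch of that idea follows. The helper proc_title, the service name 'myservice' and the status fields are all invented for illustration; a real service would choose its own layout and fill in real values.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build a process title from a base name and
# key/value status fields, sorted so the layout stays stable.
sub proc_title {
    my ($base, %info) = @_;
    return join ' ', $base, map { "[$_=$info{$_}]" } sort keys %info;
}

# Illustrative values only.
my %status = (customer => 'acme', version => '1.4', queued => 0);

$0 = proc_title('myservice', %status);

# Refresh the title whenever the state changes.
$status{queued} = 17;
$0 = proc_title('myservice', %status);

print "$0\n";
```

Keeping the title short and the field order fixed makes it easier to scan in ps output, and a short title is also less likely to run past the original argument area.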

      Starman, a Plack-based preforked web server, also does this: it displays "starman master" or "starman worker" in its process name(s).

        I had to give up using Starman early because of weird issues with internal port allocation (nothing I was doing) and segfaults. I had trouble with it on both OS X and CentOS. I have no idea if it will address your particular deployment issue but I went to uWSGI as my application engine several years ago and it’s been working great with nearly no oversight, on Ubuntu, CentOS, and OS X.

Re: Setting $0 clears /proc/PID/environ too
by Anonymous Monk on Jan 17, 2020 at 02:02 UTC

    I think that an argument could be made that Perl should try to preserve the contents of /proc/PID/environ when changing $0, and do a more thorough job of restoring the original command line if $0 is localized.

    perlbugit
