Allegro.cc - Online Community

POSIX popen and ARG_MAX
Hyena_
Member #8,852
July 2007

Does popen leak file descriptors to the command it executes? For example, I call curl with popen and end the command with an ampersand (&) so that my main process continues immediately without waiting for curl to finish.

How can I programmatically find out the maximum argument length popen can handle? ARG_MAX is not it, because in my tests popen was already failing with an argument length of about ARG_MAX/30.

bamccaig
Member #7,536
July 2006

popen does not execute the command directly. If it did, "&" would have no meaning. The command is being interpreted by /bin/sh -c. I wonder if your command would be killed by -c if not also disowned by the sh process? I'm not sure.

In glibc popen is basically just calling __execve, which as far as I can tell is just the system call implemented in the kernel in Linux. The only thing I saw in glibc is that if the number of arguments passed is greater than INT_MAX then it will error out. That's a minimum of 2-billion and change so unless you have a very exceptional command that likely isn't being hit.

The only information I could find was that it used to be based on ARG_MAX (in my Linux system that is defined as 128 kB) or on MAX_ARG_PAGES, which is apparently 32. With 4 kB pages that again yields 128 kB. That also includes the size of the environment so keep that in mind. These days it's based on 1/4 of RLIMIT_STACK, which is the stack size limit for a process. I don't know what that is, but I would guess it's in the hundreds of MiB or GiB (depending on your hardware and architecture probably). Of course, it would take out of the space available to the program for its stack, but again it seems unlikely you'd reach any of these limits. This is all specific to Linux though. You haven't told us which OS you're using.
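The two limits mentioned above can be queried at run time. The following is a minimal sketch, not taken from the thread: the helper names query_arg_max and quarter_of_stack_limit are made up for illustration, and the "one quarter of RLIMIT_STACK" rule is the Linux behavior described above, not something POSIX promises.

```c
#include <stdio.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical helper: total space for exec arguments plus environment,
 * as reported by the system. Returns -1 if the limit is indeterminate. */
long query_arg_max(void)
{
    return sysconf(_SC_ARG_MAX);
}

/* Hypothetical helper: one quarter of the soft stack limit, which is what
 * the post above says Linux bases the limit on. Returns -1 if the stack
 * limit is unlimited or cannot be read. */
long quarter_of_stack_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
        return -1;
    return (long)(rl.rlim_cur / 4);
}

/* Print both values so they can be compared on a given machine. */
void print_exec_limits(void)
{
    printf("sysconf(_SC_ARG_MAX): %ld\n", query_arg_max());
    printf("RLIMIT_STACK / 4:     %ld\n", quarter_of_stack_limit());
}
```

Calling print_exec_limits() from main shows whether the two numbers agree on the machine at hand. Both are totals for all arguments and the environment together, so neither says how long a single argument may be.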

The question is, how long is your command (keep in mind there is an additional size of "/bin/sh -c " added in the case of popen)? And how do you know that it's failing because of the length of the command? It might help if you tell us exactly what errors or return codes you're getting.

Append:

sleep.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char * argv[])
{
    int remaining, timeout;

    if (argc < 2) {
        fputs("Too few arguments.\n", stderr);
        exit(1);
    }

    timeout = atoi(argv[1]);

sleep_:
    /* sleep() returns the unslept seconds if a signal interrupts it. */
    remaining = sleep(timeout);

    if (remaining != 0) {
        timeout = remaining;
        goto sleep_;
    }

    return 0;
}

main.c

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char * argv[])
{
    FILE * handle = popen("./sleep 3500 &", "w");

    if (handle == NULL) {
        fprintf(stderr, "Failed to ^Z cat fork. (%d)\n", errno);
        exit(1);
    }

    /* pclose() waits for the shell only, not for the backgrounded ./sleep. */
    int ret = pclose(handle);

    fprintf(stderr, "./sleep returned %d\n", ret);

    return 0;
}

The program immediately returns.

$ gcc -o sleep -Wall sleep.c
$ gcc -o main -Wall main.c
$ ./main
./sleep returned 0

I think that the '&' is causing 'sh -c' to halt it immediately. My sh is dash and the manpage only says that if the shell is not interactive then stdin is set to /dev/null. I don't think that should cause this behavior so I'm not sure.

Append:

OK, here we go. Now I see. If I list processes with ps then "./sleep 3500" is still running. That makes a bit of sense. You told sh to background it so it did just that. sh was non-interactive so it exited normally. ./sleep is left running doing its thing. If you need control of those curl processes so they can't be left behind like this then you should probably not be using popen. You may need to use fork and exec or something like that to achieve this properly.

Hyena_
Member #8,852
July 2007

I'm programming it on Linux Mint but ideally it should be POSIX compliant. Below is the problematic function. To test it, I just added spaces at the end of the cfg parameter before calling str2hex on it. At some point the command no longer executed properly: netcat didn't respond with anything.

Ideally I'd like to programmatically find out the maximum length of a single argument given to popen as the command to execute because then I could implement a fallback to named pipes.

void TREASURER::bitcoin_rpc(const char *method, const nlohmann::json* params) {
    /*
     * Instead of making a blocking cURL request here we are spawning a child
     * process with popen so that we can carry on with the main program while
     * the request is being executed. When the child process finishes it will
     * connect back to the main server providing us the response from Bitcoin RPC.
     * This clever trick achieves asynchronous HTTP requests without using threads
     * in our main process.
     */
    if (manager->get_global("auth-cookie") == nullptr) {
        manager->bug("Unable to execute Bitcoin RPC '%s': cookie not found.", method);
        return;
    }

    nlohmann::json json;
    json["jsonrpc"] = "1.0";
    json["id"] = method;
    json["method"] = method;
    if (params) json["params"] = *params;
    else json["params"] = nlohmann::json::array();
    //std::cout << json.dump(4) << std::endl;

    std::string cfg;
    cfg.append("--url http://127.0.0.1:8332/\n");
    cfg.append("--max-time 10\n");
    cfg.append("-u ");
    cfg.append(manager->get_global("auth-cookie"));
    cfg.append(1, '\n');
    cfg.append("-H \"content-type: text/plain;\"\n");
    cfg.append("--data-binary @-\n");
    cfg.append(json.dump());

    std::string hex;
    str2hex(cfg.c_str(), &hex);

    std::string command = "printf \"%s\" \"";
    command.append(hex);
    command.append(1, '\"');
    command.append(" | xxd -p -r ");
    command.append(" | curl -s --config - ");
    command.append(" | xargs -0 printf 'su\nsend ");
    command.append(std::to_string(id));
    command.append(" %s\nexit\nexit\n'");
    command.append(" | netcat -q -1 localhost ");
    command.append(manager->get_tcp_port());
    command.append(" > /dev/null 2>/dev/null &");

    FILE *fp = popen(command.c_str(), "r"); // Open the command for reading.
    if (!fp) manager->bug("Unable to execute '%s'.\n", command.c_str());
    else {
        pclose(fp);
        manager->vlog("Bitcoin RPC ---> %s", method);
    }
}

I would prefer to avoid calling fork and pipe myself here because I use signals and I don't want to drive up the code complexity if it's not absolutely needed.

edit:
The resource below indicates that there is a MAX_ARG_STRLEN limit, which is 131072:
https://www.in-ulm.de/~mascheck/various/argmax/

This number looks suspiciously similar to the approximate command length that caused popen to fail. ARG_MAX is 2097152 on my system. When I divide it by 20 I get 104857, which is roughly the length at which popen failed in my tests.

bamccaig
Member #7,536
July 2006

Maybe it would help to split up the command into a few different steps. I've gathered and hacked up this little program to assist with both writing and reading to a program as a pipe. Might be helpful with such a large pipeline.

popen2.h

#ifndef POPEN2
#define POPEN2

/* command, argv, then out-parameters: child pid, fd to write to, fd to read from. */
int popen2(const char *, char * const [], int *, int *, int *);

#endif

popen2.c

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include "popen2.h"

static void close_pipe(int pipe[2]) {
    close(pipe[0]);
    close(pipe[1]);
}

/* Create both pipes; on failure no descriptor is left open. */
static int dual_pipe(int pipe_lhs[2],
                     int pipe_rhs[2])
{
    if (pipe(pipe_lhs) != 0) {
        perror("pipe");
        return -1;
    }

    if (pipe(pipe_rhs) != 0) {
        close_pipe(pipe_lhs);
        perror("pipe");
        return -1;
    }

    return 0;
}

int popen2(
    const char * command,
    char * const argv[],
    int * pid,
    int * infd,
    int * outfd)
{
    int pipe_lhs[2];
    int pipe_rhs[2];

    if (dual_pipe(pipe_lhs, pipe_rhs) != 0) {
        return -1;
    }

    *pid = fork();

    if (*pid == -1) {
        perror("fork");
        close_pipe(pipe_lhs);
        close_pipe(pipe_rhs);
        return -1;
    }

    // Parent: keep the write end of lhs and the read end of rhs.
    if (*pid) {
        *infd = pipe_lhs[1];
        *outfd = pipe_rhs[0];
        close(pipe_lhs[0]);
        close(pipe_rhs[1]);

        return 0;
    }

    // Child: lhs becomes stdin, rhs becomes stdout.
    dup2(pipe_lhs[0], 0);
    dup2(pipe_rhs[1], 1);
    close_pipe(pipe_lhs);
    close_pipe(pipe_rhs);

    execvp(command, argv);
    fprintf(stderr, "error running %s: %s\n", command, strerror(errno));
    abort();
}

main.c

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#include "popen2.h"

#define buf_length (sizeof(buf)/sizeof(char))

int main(int argc, char * argv[])
{
    char buf[50];
    /* execvp() requires the argument vector to end with a NULL pointer. */
    char * const tee_argv[] = {"tee", NULL};
    int len, pid, infd, outfd;

    if (popen2(tee_argv[0], tee_argv, &pid, &infd, &outfd) != 0) {
        exit(1);
    }

    write(infd, "Hello\n", 6);

    len = read(outfd, buf, buf_length - 1);

    /* read() returns -1 on error; treat that as an empty reply. */
    if (len < 0) {
        len = 0;
    }

    buf[len] = '\0';

    printf("%s says: %s", tee_argv[0], buf);

    if(!len || buf[len - 1] != '\n') {
        putchar('\n');
    }

    close(infd);
    close(outfd);

    return 0;
}

$ gcc -c -g -o popen2.o -Wall popen2.c
$ gcc -c -g -o main.o -Wall main.c
$ gcc -g -o popen2 -Wall popen2.o main.o
$ ./popen2
tee says: Hello

Hyena_
Member #8,852
July 2007

@bamccaig

Your solution seems attractive. I'm worried about fork, though, because I don't know exactly how to use it safely. Should I block all signals before calling fork or pipe?

edit:
Also, should I be using FD_CLOEXEC when calling popen?

The type argument is a pointer to a null-terminated string which must
contain either the letter 'r' for reading or the letter 'w' for
writing. Since glibc 2.9, this argument can additionally include the
letter 'e', which causes the close-on-exec flag (FD_CLOEXEC) to be
set on the underlying file descriptor; see the description of the
O_CLOEXEC flag in open(2) for reasons why this may be useful.
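Where the 'e' letter is not available, the same flag can be set after the fact with fcntl on the descriptor behind the stream. A minimal sketch follows; the helper names are made up for illustration. Unlike the 'e' mode it is not atomic: another thread that forks and execs between popen and the fcntl call would still inherit the descriptor.

```c
#include <fcntl.h>
#include <stdio.h>

/* Hypothetical helper: set close-on-exec on a descriptor while keeping
 * its other descriptor flags. Returns 0 on success, -1 on error. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1 ? -1 : 0;
}

/* Hypothetical helper: the same for a stdio stream, such as the FILE *
 * returned by popen. */
int set_cloexec_on_stream(FILE *fp)
{
    return set_cloexec(fileno(fp));
}
```

Typical use would be to call set_cloexec_on_stream(fp) right after a successful popen(command, "r").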

bamccaig
Member #7,536
July 2006

I'm uncertain about the importance of FD_CLOEXEC. It almost sounds like it should almost always be on by default for popen. Unless you happened to be popening a special command that expected that file descriptor to be open...

I think it's safe to use it here, but I think that's also a Linux extension so it will make the program less portable I think, which you expressed as one of your goals. The alternative, I think, would require you to wrap the program you wanted to run (e.g., curl) in a wrapper program or script that closes that file descriptor first. That said, I think that the called program would have to operate on a file descriptor that it didn't open for it to be affected, and it would be getting a file descriptor to one of its stdin/stdout handles anyway so it's probably pretty harmless...

Sounds like bash and dash both close a file descriptor if you redirect it to "&-". So this wrapper might suffice (but you'd have to ensure all supported shells work this way too, or you'd again limit portability):

cloexec.sh

#!/bin/sh

if [ $# -le 1 ]; then
    cat <<EOF 1>&2;
Usage: $0 FD COMMAND [ARG...]

FD should be an integer file descriptor. FD will be closed immediately.

Command will then be executed, receiving ARG... arguments.
EOF
    exit 1;
fi;

eval "$1>&-" || exit $?;
shift || exit $?;
"$@";

That's in theory... In practice, this doesn't seem to work. :-/ Writing an equivalent C program that goes along with your program might be more portable anyway... But then that would probably have to reimplement popen to achieve this portably so probably the answer is just don't use popen if portability matters and this leaked file descriptor is significant.

GullRaDriel
Member #3,861
September 2003

In your case I would tend to use curl API directly in the code, which would avoid the command length problem, shell invocation & stuff.

"Code is like shit - it only smells if it is not yours"
Allegro Wiki, full of examples and articles !!

Hyena_
Member #8,852
July 2007

Someone on another forum pointed out that cURL has a multi interface: https://curl.haxx.se/libcurl/c/libcurl-multi.html

While this would probably let me make very fast HTTP requests straight from my main program in a non-blocking manner without using fork, it certainly has a learning curve. So far I have only used the simple curl request API. The multi interface definitely looks promising, but since I need to get this done I don't want to spend more time researching completely new ways to solve the problem. In fact, yesterday I made some pretty good progress with the initial approach. Below is the code that falls back to named pipes if the command line would be too long.

void TREASURER::bitcoin_rpc(const char *method, const nlohmann::json* params) {
    /*
     * Instead of making a blocking cURL request here we are spawning a child
     * process with popen so that we can carry on with the main program while
     * the request is being executed. When the child process finishes it will
     * connect back to the main server providing us the response from Bitcoin RPC.
     * This clever trick achieves asynchronous HTTP requests without using threads
     * in our main process.
     */
    if (manager->get_global("auth-cookie") == nullptr) {
        manager->bug("Unable to execute Bitcoin RPC '%s': cookie not found.", method);
        return;
    }

    nlohmann::json json;
    json["jsonrpc"] = "1.0";
    json["id"] = method;
    json["method"] = method;
    if (params) json["params"] = *params;
    else json["params"] = nlohmann::json::array();
    //std::cout << json.dump(4) << std::endl;

    long arg_max = sysconf(_SC_ARG_MAX);
    size_t max_len = arg_max/16 - 1;

    std::string cfg;
    cfg.append("--url http://127.0.0.1:8332/\n");
    cfg.append("--max-time 10\n");
    cfg.append("-u ");
    cfg.append(manager->get_global("auth-cookie"));
    cfg.append(1, '\n');
    cfg.append("-H \"content-type: text/plain;\"\n");
    cfg.append("--data-binary @-\n");
    cfg.append(json.dump());

    std::string hex;
    str2hex(cfg.c_str(), &hex);

    std::string command = "{ { printf 'su\\nsend ";
    command.append(std::to_string(id));
    command.append(" '");
    command.append(" ; printf '%s' '");
    command.append(hex);
    command.append(1, '\'');
    command.append(" | xxd -p -r");
    command.append(" | curl -s --config -");
    command.append(" ; printf '\\nexit\\nexit\\n' ; }");
    command.append(" | netcat -q -1 127.0.0.1 ");
    command.append(manager->get_tcp_port());
    command.append(" ; } &");

    manager->vlog("Bitcoin RPC ---> %s", method);
    if (command.length() > max_len) {
        // Command too long for popen: pass the curl config through a FIFO instead.
        char salt[16];
        generate_salt(salt, sizeof(salt));
        std::string fifo = "./data/tmp/";
        fifo.append(salt);
        fifo.append(".fifo");

        if ( (errno = 0) || mkfifo(fifo.c_str(), 0600) != 0) {
            manager->bug("%s: mkfifo %s: %s", __FUNCTION__, fifo.c_str(), strerror(errno));
            return;
        }

        command = "{ { printf 'su\\nsend ";
        command.append(std::to_string(id));
        command.append(" '");
        command.append(" ; cat ");
        command.append(fifo);
        command.append(" | curl -s --config -");
        command.append(" ; printf '\\nexit\\nexit\\n' ; }");
        command.append(" | netcat -q -1 127.0.0.1 ");
        command.append(manager->get_tcp_port());
        command.append(" ; rm -f ");
        command.append(fifo);
        command.append(" ; } &");

        FILE *fp = popen(command.c_str(), "r"); // Open the command for reading.
        if (!fp) {
            manager->bug("Unable to execute '%s'.\n", command.c_str());
            return;
        }
        if (pclose(fp) == -1) {
            manager->bug("%s: pclose: %s", __FUNCTION__, strerror(errno));
            return;
        }

        {
            signals.block();
            int fd = open(fifo.c_str(), O_WRONLY);

            if (fd == -1) {
                manager->bug("%s: open %s: %s", __FUNCTION__, fifo.c_str(), strerror(errno));
                signals.unblock();
                return;
            }

            ssize_t written = write(fd, cfg.c_str(), cfg.size());
            if (written != (ssize_t) cfg.size()) {
                if (written == -1) manager->bug("%s: write: %s", __FUNCTION__, strerror(errno));
                else manager->bug("%s: write: wrong number of bytes written", __FUNCTION__);
            }
            if ( (errno = 0) || close(fd) != 0) manager->bug("%s: close: %s", __FUNCTION__, strerror(errno));
            signals.unblock();
        }
        return;
    }

    FILE *fp = popen(command.c_str(), "r"); // Open the command for reading.
    if (!fp) {
        manager->bug("Unable to execute '%s'.\n", command.c_str());
        return;
    }
    if (pclose(fp) == -1) {
        manager->bug("%s: pclose: %s", __FUNCTION__, strerror(errno));
        return;
    }
}

It works well but there are a couple enhancements that I'd really need. Firstly there are these 2 lines:

long arg_max = sysconf(_SC_ARG_MAX);
size_t max_len = arg_max/16 - 1;

By trial and error I found that the maximum command length is ARG_MAX/16 - 1. Is there a portable way of determining that limit? I'm afraid ARG_MAX/16 could be either a coincidence or specific to my particular development platform.
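The ARG_MAX/16 - 1 figure matches the Linux per-string limit MAX_ARG_STRLEN mentioned earlier in the thread, which the kernel defines as 32 pages: with 4 KiB pages that is 131072 bytes including the terminating NUL, and 2097152/16 is the same number. Since popen hands the whole command to sh -c as one argument, that per-string limit applies to it. The sketch below encodes this Linux-specific rule; it is not a POSIX guarantee, and the helper names are made up for illustration.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: size limit for one exec argument on Linux,
 * 32 pages (MAX_ARG_STRLEN), terminating NUL included.
 * Returns -1 if the page size cannot be determined. */
long linux_single_arg_limit(void)
{
    long page = sysconf(_SC_PAGESIZE);

    return page > 0 ? 32 * page : -1;
}

/* Hypothetical helper: 1 if a command of command_len bytes (NUL not
 * counted) fits into one argument under that rule, 0 otherwise.
 * An unknown limit counts as "does not fit", so that the caller
 * takes its fallback path. */
int command_fits_in_one_arg(size_t command_len)
{
    long limit = linux_single_arg_limit();

    if (limit <= 0)
        return 0;
    return command_len < (size_t)limit;
}
```

On other systems the per-argument limit, if there is one, may differ, so a fixed conservative threshold together with the named-pipe fallback remains the more portable choice.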

I will also probably have to add some CLOEXEC flags here and there.

GullRaDriel
Member #3,861
September 2003

You should consider putting the curl calls in a forked process: depending on the version, curl is known to leak, so if that happens to you, you already have a solution.

The maximum command-line size depends on the host.

Here is what I get on Linux for the maximum command size:

gull@Althea:~$ xargs --show-limits
Your environment variables take up 2625 bytes
POSIX upper limit on argument length (this system): 2092479
POSIX smallest allowable upper limit on argument length (all systems): 4096
Maximum length of command we could actually use: 2089854
Size of command buffer we are actually using: 131072
Maximum parallelism (--max-procs must be no greater): 2147483647

Execution of xargs will continue now, and it will try to read its input and run commands; if this is not what you wanted to happen, please press <Ctrl-D> (EOF).
Warning: echo will be run at least once. If you do not want that to happen, then press the interrupt keystroke.

That should work on Linux systems. It shows that I can have a command line of at most 2 MB.

For Windows: https://support.microsoft.com/fr-fr/help/830473/command-prompt-cmd.-exe-command-line-string-limitation

You may want to switch the page to English to understand it better.

Edited.

Hyena_
Member #8,852
July 2007

Any idea how to programmatically calculate the value shown on this line:
Size of command buffer we are actually using: 131072 ("Taille du tampon de commande actuellement utilisé" in the French output)

Or perhaps you know some guaranteed minimum command line buffer size and how to get its value? It seems to me that 4096 is the guaranteed minimum argument length but I am not 100% sure about that.

GullRaDriel
Member #3,861
September 2003

On my system the currently allocated buffer is 131072 bytes out of a usable 2 MB.

That link may have what you want:
https://www.in-ulm.de/~mascheck/various/argmax/

bamccaig
Member #7,536
July 2006

Hyena_
Member #8,852
July 2007

@bamccaig

I did look at the xargs source. :D And it was ugly. There were quite a few checks to determine that length, and in some cases I believe the fallback was just to gradually increase the length of a string and feed it to the command line interpreter until it fails. I believe I did see a reference to a guaranteed minimum buffer size for the command line, and it was 4096. So I am now simply using that number in my code, because it is enough in 99% of my cases. In the 1% of cases where 4096 is not enough, even 131072 would not be enough, so I'm using the named-pipe fallback anyway.

Here's the new version of my function. I had to get rid of the curly brackets around the printf commands: when curl made requests to remote hosts rather than to localhost, the delay between the printf and the subsequent curl command was so long that netcat registered an EOF after the first printf, after which curl exited with error "23 failed writing body". The problem did not show up when I tested locally. The fixed version uses xargs to turn stdin into a command-line parameter of another command.

void TREASURER::bitcoin_rpc(const char *method, const nlohmann::json* params) {
    /*
     * Instead of making a blocking cURL request here we are spawning a child
     * process with popen so that we can carry on with the main program while
     * the request is being executed. When the child process finishes it will
     * connect back to the main server providing us the response from Bitcoin RPC.
     * This clever trick achieves asynchronous HTTP requests without using threads
     * in our main process.
     */
    if (manager->get_global("auth-cookie") == nullptr) {
        manager->bug("Unable to execute Bitcoin RPC '%s': cookie not found.", method);
        return;
    }

    if (manager->get_global("wallet-host") == nullptr) {
        manager->bug("Unable to execute Bitcoin RPC '%s': wallet host not set.", method);
        return;
    }

    if (manager->get_global("wallet-port") == nullptr) {
        manager->bug("Unable to execute Bitcoin RPC '%s': wallet port not set.", method);
        return;
    }

    nlohmann::json json;
    json["jsonrpc"] = "1.0";
    json["id"] = method;
    json["method"] = method;
    if (params) json["params"] = *params;
    else json["params"] = nlohmann::json::array();
    //std::cout << json.dump(4) << std::endl;

    size_t max_len = 4096 - 1; // Maximum command line length.

    std::string request = json.dump();
    std::string cfg;
    cfg.append("--url http://");
    cfg.append(manager->get_global("wallet-host"));
    cfg.append(1, ':');
    cfg.append(manager->get_global("wallet-port"));
    cfg.append("/\n");
    cfg.append("--max-time 10\n");
    cfg.append("-u ");
    cfg.append(manager->get_global("auth-cookie"));
    cfg.append(1, '\n');
    cfg.append("-H \"content-type: text/plain;\"\n");
    cfg.append("--data-binary @-\n");
    cfg.append(request);

    std::string hex;
    str2hex(cfg.c_str(), &hex);

    std::string command = "printf \"%s\" \"";
    command.append(hex);
    command.append(1, '\"');
    command.append(" | xxd -p -r ");
    command.append(" | curl -s --config - ");
    command.append(" | xargs -0 printf 'su\\nsend ");
    command.append(std::to_string(id));
    command.append(" %s\\nexit\\nexit\\n'");
    command.append(" | netcat -q -1 127.0.0.1 ");
    command.append(manager->get_tcp_port());
    command.append(" &");

    if (command.length() > max_len) {
        // Command too long for popen: pass the curl config through a FIFO instead.
        manager->log("Bitcoin RPC ---> %s (%lu bytes)", method, request.size());
        char salt[16];
        generate_salt(salt, sizeof(salt));
        std::string fifo = options.datadir;
        fifo.append("/tmp/");
        fifo.append(salt);
        fifo.append(".fifo");

        if ( (errno = 0) || mkfifo(fifo.c_str(), 0600) != 0) {
            manager->bug("%s: mkfifo %s: %s", __FUNCTION__, fifo.c_str(), strerror(errno));
            return;
        }

        command = "{ { printf 'su\\nsend ";
        command.append(std::to_string(id));
        command.append(" '");
        command.append(" ; cat ");
        command.append(fifo);
        command.append(" | curl -s --config -");
        command.append(" ; printf '\\nexit\\nexit\\n' ; }");
        command.append(" | netcat -q -1 127.0.0.1 ");
        command.append(manager->get_tcp_port());
        command.append(" ; rm -f ");
        command.append(fifo);
        command.append(" ; } &");

        FILE *fp = popen(command.c_str(), "re"); // Open the command for reading.
        if (!fp) {
            manager->bug("Unable to execute '%s'.\n", command.c_str());
            return;
        }
        if (pclose(fp) == -1) {
            manager->bug("%s: pclose: %s", __FUNCTION__, strerror(errno));
            return;
        }

        {
            signals.block();
            int fd = open(fifo.c_str(), O_CLOEXEC|O_WRONLY);

            if (fd == -1) {
                manager->bug("%s: open %s: %s", __FUNCTION__, fifo.c_str(), strerror(errno));
                signals.unblock();
                return;
            }

            ssize_t written = write(fd, cfg.c_str(), cfg.size());
            if (written != (ssize_t) cfg.size()) {
                if (written == -1) manager->bug("%s: write: %s", __FUNCTION__, strerror(errno));
                else manager->bug("%s: write: wrong number of bytes written", __FUNCTION__);
            }
            if ( (errno = 0) || close(fd) != 0) manager->bug("%s: close: %s", __FUNCTION__, strerror(errno));
            signals.unblock();
        }
        return;
    }
    else manager->vlog("Bitcoin RPC ---> %s", method);

    FILE *fp = popen(command.c_str(), "re"); // Open the command for reading.
    if (!fp) {
        manager->bug("Unable to execute '%s'.\n", command.c_str());
        return;
    }
    if (pclose(fp) == -1) {
        manager->bug("%s: pclose: %s", __FUNCTION__, strerror(errno));
        return;
    }
}

edit:
Damn, I almost forgot that the fallback also has to get rid of the semicolons and printf calls that add a prefix and suffix to the string returned by curl before feeding it to netcat. And in the non-fallback case, if the string returned by curl is too long, xargs will fail as well.

Any ideas how to add a prefix and suffix to the string that we get from stdin before directing it to stdout?

GullRaDriel
Member #3,861
September 2003

I don't quite get the problem with your last question.

1) You get a string from stdin which you store in string_in
2) You allocate a new empty string newstring
3) append prefix to newstring
4) append string_in to newstring
5) append suffix to newstring
6) cout newstring ?

or shorter:

1)Prepare a new empty string
2)append prefix
3)append stdin
4)append suffix
5)cout newstring

Hyena_
Member #8,852
July 2007

I just made some more progress.

Here is my old code, which behaved strangely. My program never heard back from netcat when I spawned curl with popen like this:

{ { printf '%s' '73750a6c6f6720776f6f7420' | xxd -p -r ; printf '%s' '2d2d75726c2068747470733a2f2f626974636f696e666565732e32312e636f2f6170692f76312f666565732f7265636f6d6d656e6465640a2d2d6d61782d74696d652031300a' | xxd -p -r | curl -N -s --config - ; printf '%s' '0a657869740a657869740a' | xxd -p -r ; } | netcat -I 23 -O 23 -q 60 -w 60 127.0.0.1 4000 ; rm -f ./data/tmp/fa37JncCHryDsbz.fifo ; } &

I was unable to figure out why the shell started from popen did not finish properly. I still don't quite know why the previous code does not work but after adding unbuffer -p in front of the netcat command everything started to work. It is even more strange that stdbuf -i0 -o0 -e0 in front of netcat did not make any difference. So the unbuffer command does some magic that made my program work.

{ { printf '%s' '73750a6c6f6720776f6f7420' | xxd -p -r ; printf '%s' '2d2d75726c2068747470733a2f2f626974636f696e666565732e32312e636f2f6170692f76312f666565732f7265636f6d6d656e6465640a2d2d6d61782d74696d652031300a' | xxd -p -r | curl -N -s --config - ; printf '%s' '0a657869740a657869740a' | xxd -p -r ; } | unbuffer -p netcat -I 23 -O 23 -q 60 -w 60 127.0.0.1 4000 ; rm -f ./data/tmp/fa37JncCHryDsbz.fifo ; } &

This stuff is quite hard to debug because when I start those commands from my terminal window they both finish nicely. The first version of the command is only malfunctioning when it is started with popen from the very process that is supposed to receive the curl response via netcat in the end.

bamccaig
Member #7,536
July 2006

The issue then is that somehow buffered IO was causing something to fail?

Side note: String manipulation is going to require "slow" memory allocations whose results will likely be thrown away immediately after. Avoid needless string allocations by using a std::stringstream to build the final string, or by writing directly to the output stream instead of concatenating into a string first.

1) You get a string from stdin which you store in string_in
2) cout prefix
3) cout string_in
4) cout suffix ?

FTFY. Or if you need the whole string, use the stringstream as above. I imagine it's more efficient (modern libraries usually have a "StringBuilder" or equivalent that does this same thing to avoid constantly reallocating string objects...at least in C++ the same buffer might just be getting extended without copying).
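The streaming variant of those steps can be sketched in C, with no intermediate string at all: write the prefix, copy the input through in chunks, then write the suffix. The function name copy_with_affixes is made up for illustration and does not come from the thread.

```c
#include <stdio.h>

/* Hypothetical helper: write prefix, then everything readable from `in`,
 * then suffix, to `out`. Returns 0 on success, -1 on any I/O error. */
int copy_with_affixes(FILE *in, FILE *out,
                      const char *prefix, const char *suffix)
{
    char buf[4096];
    size_t n;

    if (fputs(prefix, out) == EOF)
        return -1;

    /* Copy in fixed-size chunks so the input length does not matter. */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            return -1;
    }
    if (ferror(in))
        return -1;

    if (fputs(suffix, out) == EOF)
        return -1;
    return fflush(out) == EOF ? -1 : 0;
}
```

Called as copy_with_affixes(stdin, stdout, prefix, suffix), a small program built around this could sit between curl and netcat in the pipeline, and it would not run into the argument-length limit that xargs has.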

While I'm nitpicking, printf '%s' 'x' is the same as echo -n 'x' (but maybe that's not as portable?).

While I'm on the subject of portability, my netcat in Linux doesn't have -I or -O options... :-/ Are you sure that's portable?

Hyena_
Member #8,852
July 2007

Well, those -I and -O options didn't do any good anyway, so I have already dropped them. Everything would have been perfect if unbuffer had worked for large responses too. This is getting sad already :D It turns out that unbuffer gave me only the first n bytes of the curl response, so it is of no use. I am running out of ideas.

edit:
This is becoming ridiculous. I just want to add a string prefix and suffix to whatever curl writes to its stdout before feeding it to netcat. And it does not work. I've tried hundreds of combinations, using stdbuf and unbuffer, and some hacks with variables. Unbuffer almost worked, but it started failing on larger curl responses.

About std::strings being slow, I'm not buying it. Let's say I reserve 1000 bytes for the string right away and then append to it, no way it would be slow.

bamccaig
Member #7,536
July 2006

Maybe it would help to debug to write each stdout to files so that you can inspect the results at each stage? Instead of piping, write to a file, and then substitute that file for the stdin of the next process.

Append:

Agreed, if you reserve space for the string ahead of time you can avoid the overhead. That is, if you know a reasonable size to use. If it can vary a lot, you could be wasting lots of memory or still run into reallocation overhead.

Hyena_
Member #8,852
July 2007

Yes I tested with files and it worked well. I did not test file based logic with large curl responses though. Perhaps I could utilize named pipes somehow, if I'm using them anyway in case of requests that have large bodies.

Append
I just made humongous progress on this issue. I suspect there is a bug in the popen function or perhaps in the Linux kernel. It turns out that if I call popen in write mode instead of read mode, everything works as expected.

More about it here:
https://forums.linuxmint.com/viewtopic.php?f=47&t=242879&p=1298492#p1298492

Append 2
Holy mother of god! I just tested and turns out that popen in read mode can also be fixed with a really simple trick. Here's the fixed command line:

{ printf '%s\n%s\n%s' 'su' 'log start' 'log ' ; printf "%s" "2d2d75726c2068747470733a2f2f626c6f636b636861696e2e696e666f2f726177626c6f636b2f3237373737373f666f726d61743d6865780a2d2d6d61782d74696d652031300a2d482022636f6e74656e742d747970653a20746578742f706c61696e3b220a" | xxd -p -r  | curl -s --config - | tail -c 10  ; printf '\nlog end\n%s\n%s\n' 'exit' 'exit' ; } | netcat -q 10 -w 10 127.0.0.1 4000 > /dev/null 2>/dev/null &

And here's the malfunctioning one:

{ printf '%s\n%s\n%s' 'su' 'log start' 'log ' ; printf "%s" "2d2d75726c2068747470733a2f2f626c6f636b636861696e2e696e666f2f726177626c6f636b2f3237373737373f666f726d61743d6865780a2d2d6d61782d74696d652031300a2d482022636f6e74656e742d747970653a20746578742f706c61696e3b220a" | xxd -p -r  | curl -s --config - | tail -c 10  ; printf '\nlog end\n%s\n%s\n' 'exit' 'exit' ; } | netcat -q 10 -w 10 127.0.0.1 4000 > &

As you can see redirection of stdout and stderr to /dev/null made it work?! ;D ???
