Limit pipe size and write to temporary file as a fallback
I have a somewhat tricky question. I want to pipe the result of one command to another, but since I'm working on a system with limited memory, I want to make sure the pipe won't use too much of it. I don't want it to break when it hits the limit, though, just to switch to using the disk as a temporary file.
The use case is as follows. I download a large file using curl or wget. I pipe the result to another program (to a named pipe, actually, but that named pipe is immediately fed to another command). If everything goes well, the second command can consume the input faster than curl produces it (the download is slower than the second command's processing).
But sometimes things go wrong and the second command needs some time before it starts consuming. It probably will start eventually, but since that second command needs a bit of RAM and that resource is limited, I want to switch to writing to disk if the pipe starts to use more than, let's say, 200 MB of RAM.
That second command may even need more time to start than the time needed to download the file. In that case the downloaded file should be written to disk completely, so that the second process can consume it later.
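For illustration, the setup described above can be sketched in a few lines of shell. This is only a stand-in: printf takes the place of curl/wget and cat takes the place of the second command (both are placeholders for the real programs).

```shell
#!/bin/sh
# Sketch of the described setup. printf stands in for curl/wget and
# cat stands in for the consuming command; both are placeholders.
set -e
dir=$(mktemp -d)
mkfifo "$dir/feed"
# The writer blocks as soon as the kernel pipe buffer (64 KiB by
# default on Linux) is full and nothing is reading from the FIFO.
printf 'downloaded data' > "$dir/feed" &
result=$(cat "$dir/feed")     # the consuming command
wait
echo "$result"
rm -rf "$dir"
```

The blocking behavior is exactly the problem in the question: the kernel pipe buffer is tiny and fixed, so a slow consumer stalls the producer instead of spilling to disk.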
Is there any solution for such a silly problem?
linux command-line bash memory pipe
asked Jan 6 at 15:18
Anonymouse
1 Answer
Your question is similar to this other one: Use disk-backed buffer between pipes, where the answer is mbuffer -T /path/to/file. The difference is your idea to combine buffers:
I want to switch to writing to disk if the pipe starts to use more than, let's say, 200 MB of RAM.
Connect two buffers like this:
feeder | mbuffer -T /path/to/file -m 2G | mbuffer -m 200M | consumer
Data will flow as far downstream as it can, so the 200 MiB memory buffer will be filled first (if ever). Only then will the 2 GiB disk-based buffer start to hold data.
Notes:
It seems the file gets allocated at its full size at the very beginning, so if there's no space left on the device, the whole pipeline fails early. This does introduce some initial delay, though.
In my tests the first mbuffer writes to the file even when the second one reads from the pipe immediately. I expected the first one to pass the data along without touching the file, but that's not the case: the entire pipeline is limited by the disk speed even when the disk-based buffer is not in use at all.
Won't that always write to the disk, even if only briefly, since the disk-based buffer sits between the feeder and the memory-based one? Or does mbuffer only write to disk if nothing is reading from it? I don't completely understand how this works.
– Anonymouse
Jan 6 at 20:34
But maybe it doesn't always write to disk. “While ZFS receive can’t receive, mbuffer buffers; when zfs receive can receive, mbuffer sends it data as fast as it can” – that's what I found here. If mbuffer has a way to know whether something is reading from it, it could behave that way. Maybe someone else here knows something about this? What also bothers me is using two pipes; I wonder if it affects RAM usage considerably.
– Anonymouse
Jan 6 at 21:03
@Anonymouse Unfortunately my recent tests show your doubts are justified. I read /dev/zero via two mbuffers to /dev/null. With two memory-based buffers it was fast. With a disk-based buffer and a memory-based one, the throughput was limited and it matched the speed of the disk.
– Kamil Maciorowski
Jan 6 at 21:11
That's unfortunate. But what may help in my case is that the sending process will usually be slower than the disk, so the reading process will have time to catch up. I would still leave the answer up, as it may help someone. I would accept yours, since it really could be helpful in my case, but I don't want to discourage others from posting something better. Anyway, thanks!
– Anonymouse
Jan 6 at 21:17
Great, we'll wait together then. :-)
– Anonymouse
Jan 6 at 21:24
edited Jan 6 at 21:31
answered Jan 6 at 19:35
Kamil Maciorowski