Limit pipe size and write to temporary file as a fallback



























I have a somewhat tricky question. I want to pipe the result of one command to another, but since I'm working on a system with limited memory, I want to make sure the pipe won't use too much of it. I don't want it to break when it hits the limit, though, just to switch to using the disk as a temporary file.



The use case is as follows. I download a large file using curl or wget. I pipe the result to another program (to a named pipe, actually, but that named pipe is immediately fed to another command). If everything goes well, the second command can consume the input faster than curl can produce it (the download is slower than the second command's processing).
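To make that concrete, the setup looks roughly like this; `printf` and `cat` are just placeholders standing in for the real downloader and consumer:

```shell
#!/bin/sh
# Sketch of the setup; printf stands in for curl/wget and cat for the
# second command (both are placeholders).
fifo=$(mktemp -u)                 # path for the named pipe
mkfifo "$fifo"
printf 'payload\n' > "$fifo" &    # downloader writes into the named pipe
cat "$fifo"                       # consumer reads from the other end
wait
rm -f "$fifo"
```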



But sometimes things go wrong and the second command needs some time before it starts consuming. It probably will start eventually, but since that second command also needs a fair bit of RAM and that resource is limited, I want to switch to writing to disk if the pipe starts to use more than, let's say, 200 MB of RAM.



That second command may even need more time to start than the download takes. In that case the downloaded file should be written to disk in its entirety, so that the second process can consume it later.



Is there any solution for such a silly problem?





































































      linux command-line bash memory pipe
















      asked Jan 6 at 15:18









      Anonymouse






















          1 Answer
































          Your question is similar to this other one: Use disk-backed buffer between pipes, where the answer is mbuffer -T /path/to/file. The difference is in your idea to combine buffers:




          I want to switch to writing to disk if the pipe starts to use more than, let's say, 200 MB of RAM.




          Connect two buffers like this:



          feeder | mbuffer -T /path/to/file -m 2G | mbuffer -m 200M | consumer


          Data will flow as far downstream as it can, so the 200 MiB memory buffer will be filled first (if ever). Only then will the 2 GiB disk-based buffer start to hold data.



          Notes:




          • It seems the file gets allocated at its full size right at the start, so if there's not enough space left on the device, the whole pipeline fails early. The preallocation also introduces some initial delay.


          • In my tests the first mbuffer writes to the file even when the second one reads from the pipe immediately. I expected the first one to pass the data along without touching the file, but that's not the case: the entire pipeline is limited by disk speed even when the disk-based buffer isn't actually in use.
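          For the question's concrete use case, the whole thing might be invoked like this ($URL, the spill-file path and `consumer` are placeholders, not real values; -q only silences mbuffer's status output):

          ```shell
          # Hypothetical end-to-end invocation; $URL, /tmp/spill.bin and
          # `consumer` are placeholders.
          curl -sL "$URL" \
            | mbuffer -q -T /tmp/spill.bin -m 2G \
            | mbuffer -q -m 200M \
            | consumer
          ```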

































          • Won't that always write to the disk, even if briefly, since the disk-based buffer sits between the feeder and the memory-based one? Or maybe mbuffer only writes to disk when nothing is receiving from it? I don't completely understand how this works.

            – Anonymouse
            Jan 6 at 20:34













          • But maybe it doesn't always write to disk. “While ZFS receive can’t receive, mbuffer buffers, when zfs receive can receive, mbuffer sends it data as fast as it can” – that's what I found here. If mbuffer has a way to know whether something is receiving from it, it could behave that way. Maybe someone else here knows something about this? What else bothers me is using two pipes; I wonder if it affects RAM usage considerably.

            – Anonymouse
            Jan 6 at 21:03













          • @Anonymouse Unfortunately my recent tests show your doubts are justified. I read /dev/zero through two mbuffers to /dev/null. With two memory-based buffers it was fast. With a disk-based buffer followed by a memory-based one, the throughput was limited and matched the speed of the disk.

            – Kamil Maciorowski
            Jan 6 at 21:11
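
            The comparison I ran, roughly (sizes are illustrative; mbuffer must be installed and -q just silences its status output):

            ```shell
            # Two memory-based buffers: fast.
            dd if=/dev/zero bs=1M count=1024 | mbuffer -q -m 200M | mbuffer -q -m 200M > /dev/null
            # Disk-based buffer first: throughput drops to disk speed.
            dd if=/dev/zero bs=1M count=1024 | mbuffer -q -T /tmp/buf.bin -m 1G | mbuffer -q -m 200M > /dev/null
            ```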











          • That's unfortunate. But what may help in my case is that the sending process will usually be slower than the disk, so the reading process will have time to catch up. I would still leave that answer up, as it may help someone. I would accept yours, since it really could be helpful in my case, but I don't want to discourage others from posting something that could be better. Anyway, thanks!

            – Anonymouse
            Jan 6 at 21:17











            Great, we'll wait together then. :-)

            – Anonymouse
            Jan 6 at 21:24











          edited Jan 6 at 21:31

























          answered Jan 6 at 19:35









          Kamil Maciorowski
