Get HTML Code from a website after it completed loading
I am trying to get the HTML code of a specific website asynchronously with the following code:
var response = await httpClient.GetStringAsync("url");
The problem is that the website usually takes another second to load its remaining parts, which are the parts I need. So my question is whether I can load the site first and read the content after a certain amount of time.
Sorry if this question already got answered, but I didn't really know what to search for.
Thanks,
Twenty
Edit #1
If you want to try it yourself, the URL is http://iloveradio.de/iloveradio/. I need the title and the artist, which do not load immediately.
c# web-scraping dotnet-httpclient
asked Dec 22 '18 at 19:10
Twenty
5 Answers
You are heading in the wrong direction. The referenced site has a playlist API that returns JSON. You can get the information from:
http://iloveradio.de/typo3conf/ext/ep_channel/Scripts/playlist.php
Edit: I used the Chrome Inspector to find the playlist link.
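As a rough sketch of what calling that endpoint could look like: the snippet below downloads the response and prints it with indentation so you can see which fields it contains. It assumes the endpoint returns plain JSON (if it does not, JsonDocument.Parse throws), and it makes no assumption about the field names. Indent and FetchAsync are helper names made up for this example, and it is written with top-level statements (C# 9 or later) and System.Text.Json.

```csharp
using System;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

// Re-serialize a JSON string with indentation so its structure is easy to read.
// Throws JsonException if the input is not valid JSON.
static string Indent(string json)
{
    using var doc = JsonDocument.Parse(json);
    return JsonSerializer.Serialize(doc.RootElement, new JsonSerializerOptions { WriteIndented = true });
}

// Download the body of the given URL as a string.
// GetStringAsync throws HttpRequestException on a non-success status code.
static async Task<string> FetchAsync(HttpClient client, string url)
{
    return await client.GetStringAsync(url);
}

// Only touch the network when explicitly asked to: run the program with "--fetch".
if (args.Length > 0 && args[0] == "--fetch")
{
    using var client = new HttpClient();
    var json = await FetchAsync(client, "http://iloveradio.de/typo3conf/ext/ep_channel/Scripts/playlist.php");
    Console.WriteLine(Indent(json));
}
```

Once you know the shape of the response, you can read the artist and title from the parsed document instead of printing it.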
Thanks a lot, you saved my day. How did you find that one?
– Twenty
Dec 22 '18 at 19:31
If this answer is useful, please consider marking it as the accepted answer. Thanks.
– Simonare
Dec 22 '18 at 19:37
Still, how did you find out?
– Twenty
Dec 22 '18 at 19:39
Check my answer again please
– Simonare
Dec 22 '18 at 19:41
You could use Puppeteer-Sharp:
using System;
using PuppeteerSharp;

// Download a Chromium build that Puppeteer can drive.
await new BrowserFetcher().DownloadAsync(BrowserFetcher.DefaultRevision);
// Headless = false opens a visible browser window.
using (var browser = await Puppeteer.LaunchAsync(new LaunchOptions { Headless = false }))
using (var page = await browser.NewPageAsync())
{
    await page.SetViewportAsync(new ViewPortOptions() { Width = 1280, Height = 600 });
    await page.GoToAsync("http://iloveradio.de/iloveradio/");
    // Wait until the dynamically loaded element exists in the DOM.
    await page.WaitForSelectorAsync("#artisttitle DIV");
    // This expression relies on the page exposing jQuery as $.
    var artist = await page.EvaluateExpressionAsync<string>("$('#artisttitle DIV')[0].innerText");
    Console.WriteLine(artist);
    Console.ReadLine();
}
If some things load later, it means they are generated by JavaScript code after the page has loaded (by an AJAX request, for example). So no matter how long you wait, the response won't contain the content you want, because that content is not in the source code when the page loads.
Easy way to do it:
Use a WebBrowser control and, when its DocumentCompleted event fires, wait until the element you want appears.
The Right Way:
Find the JavaScript that loads the content and trigger it yourself (easy to say, hard to do).
The thing to understand here is that when you read the response from the URL, all you will ever get is the raw response, in this case the HTML source code the server replied with.
Unlike what you might see in your browser's DOM Inspector developer tools, you will only get the original HTML source from the page (what you might see in the "Page Source" developer tool) which will not include any dynamically created content (JavaScript) or loaded content (like iframes).
So you aren't getting what you see here in the DOM Inspector:
You are getting what you see here in the Page Source (View > Developer > View Source in Chrome):
You can't wait for that other content to load, because it never will: the HTML you downloaded isn't being parsed or rendered the way a browser would do it.
You have several options available though:
- See if the website has an API you can use
- Determine where that content you want is actually loaded from, and make another/different HTTP request to that content (the Network Panel is helpful here)
- Use a headless browser to programmatically load the page and dynamically read the page contents (this will add a lot of overhead, and should probably be avoided if possible)
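To find out which case you are in, one quick check is to download the raw HTML and see whether the text you can see in the browser is in it at all. The sketch below does that. IsInRawHtml is a helper name made up for this example, and the comparison is a plain case-insensitive substring search, so it can miss text that the server sends with HTML entities or split across tags.

```csharp
using System;
using System.Net.Http;

// True if the text you can see in the browser is present in the raw HTML the server sent.
static bool IsInRawHtml(string rawHtml, string visibleText)
{
    return rawHtml.IndexOf(visibleText, StringComparison.OrdinalIgnoreCase) >= 0;
}

// Usage: pass the page URL and a piece of text you can see in the browser.
if (args.Length == 2)
{
    using var client = new HttpClient();
    var html = await client.GetStringAsync(args[0]);
    Console.WriteLine(IsInRawHtml(html, args[1])
        ? "Present in the raw HTML: GetStringAsync is enough."
        : "Not in the raw HTML: the content is probably added later by JavaScript.");
}
```

If the text is missing, open the Network panel, reload the page, and look for the request whose response contains it. That request is the one to call directly.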
I checked the website, and the data is loaded by JavaScript. With httpClient.GetStringAsync("url") you can only get the HTML.
As far as I know, there is no way to get the elements that are manipulated by the browser.
edited Dec 22 '18 at 19:41
answered Dec 22 '18 at 19:29
Simonare
answered Dec 22 '18 at 19:21
hardkoded
answered Dec 22 '18 at 19:23
Ashkan Mobayen Khiabani
edited Dec 23 '18 at 0:15
answered Dec 22 '18 at 23:47
Alexander O'Mara
answered Dec 22 '18 at 19:22
Fagun