java - Selenium: driver.getPageSource() differs from the source viewed in the browser

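I am trying to use Selenium to capture the source of a given HTML page, but for some reason the source I get is not exactly what I see in the browser.

Here is the Java code I use to capture the source and write it to a file: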

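import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

private static void getHTMLSourceFromURL(String url, String fileName) {
    WebDriver driver = new FirefoxDriver();
    try {
        driver.get(url);
        Thread.sleep(5000);   // crude fixed wait so that the page can finish loading

        // split the source into lines so it can be written out line by line
        List<String> pageSource = new ArrayList<String>(Arrays.asList(driver.getPageSource().split("\n")));

        // write to the file named by the fileName parameter
        writeTextToFile(pageSource, fileName);
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();   // keep the interrupt status for callers
        e.printStackTrace();
    } finally {
        // always close the browser, even if something above failed
        System.out.println("quitting webdriver");
        driver.quit();
    }
}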
 
/**
 * creates file with fileName and writes the content
 * 
 * @param content
 * @param fileName
 */
private static void writeTextToFile(List<String> content, String fileName) {
    File dir = new File(".", "HTML Sources");
    // mkdirs() returns false when the directory already exists, so check exists() first
    if (!dir.exists() && !dir.mkdirs()) {
        System.err.println(dir + " could not be created");
        return;
    }

    File output = new File(dir, fileName);
    // FileWriter in append mode creates the file if it does not exist yet;
    // try-with-resources closes the writer even if writing fails
    try (PrintWriter pw = new PrintWriter(new FileWriter(output, true))) {
        for (String line : content) {
            pw.print(line);
            pw.print("\n");
        }
    } catch (IOException ioe) {
        ioe.printStackTrace();
    }
}

Can someone explain why this happens? How does WebDriver render the page, and how does the browser display the source?

Solution

There are several places you can get the source from. You could try

String pageSource=driver.findElement(By.tagName("body")).getText();

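and see what comes out. Note that getText() returns the visible text of the element rather than its HTML markup, so this shows what the browser rendered, not the tags.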

Normally you do not need to wait for the page to load. Selenium does this automatically, unless parts of the page are loaded separately by JavaScript / Ajax.
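For the JavaScript / Ajax case, a fixed Thread.sleep(5000) either waits too long or not long enough. The usual alternative is to poll a condition until it holds or a timeout expires, which is the idea behind Selenium's WebDriverWait. The following is only a minimal plain-Java sketch of that polling idea, not Selenium's implementation; the class name PollingWait and the method waitUntil are made up for illustration.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper: illustrates polling with a timeout instead of a fixed sleep.
final class PollingWait {

    /**
     * Checks the condition repeatedly until it is true or the timeout elapses.
     * Returns true if the condition became true in time, false on timeout
     * or if the waiting thread was interrupted.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();   // keep the interrupt status
                return false;
            }
        }
    }
}
```

With a real driver, the condition would be something that tells you the dynamic part has arrived, for example that a particular element is present. Which element to wait for depends on the page and is not something the question specifies.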

You may want to add the differences you are seeing to the question, so that we can understand what you actually mean.
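One way to pin down the differences is to save both versions as lists of lines (the question's code already splits the page source on "\n") and find where they first diverge. A minimal sketch, assuming both sources are available as List<String>; the class name SourceDiff and the method firstDifference are hypothetical names used only for illustration.

```java
import java.util.List;

// Hypothetical helper: locates the first line at which two saved sources differ.
final class SourceDiff {

    /**
     * Returns the 0-based index of the first line that differs between a and b,
     * or -1 if both lists are identical. If one list is a prefix of the other,
     * the index of the first extra line is returned.
     */
    static int firstDifference(List<String> a, List<String> b) {
        int common = Math.min(a.size(), b.size());
        for (int i = 0; i < common; i++) {
            if (!a.get(i).equals(b.get(i))) {
                return i;
            }
        }
        return a.size() == b.size() ? -1 : common;
    }
}
```

Calling firstDifference with the lines from driver.getPageSource() and the lines copied from the browser's View Source gives a concrete line to quote in the question. The comparison is exact, so whitespace-only changes also count as differences.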

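WebDriver does not render the page itself; the browser does, and WebDriver just returns the page as the browser currently sees it. That can differ from View Source, which typically shows the HTML as it was received from the server, before any scripts modified it.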
