Welcome to Vigges Developer Community - Open, Learning, Share
Welcome To Ask or Share your Answers For Others

web scraping - How to edit this code so that I can view only text?

This is a snippet from a config file of snownews (a terminal-based news aggregator).

Problem: When I view an RSS feed in the terminal, I can only see the text up to the image. After that, everything is blank. Since I'm using a terminal, images are not supported anyway.

image: https://imgur.com/a/O3Gq2Dl

Here is the code where the web scraping takes place. I only need to view the text, without any images.

    # Importing
    if (($PROGRAM_NAME =~ "snow2opml") || ($ARGV[0] eq "--export")) {
            OPMLexport();
    } else {
            my $parser = XML::LibXML->new();
            $parser->validation(0);                         # Turn off validation from libxml
            $parser->recover(1);                            # And ignore any errors while parsing
    
            my(@lines) = <>;
            my($input) = join ("\n", @lines);
    
            my($doc) = $parser->parse_string($input);
            my($root) = $doc->documentElement();
    
            # Parsing the document tree using xpath
            my(@items) = $root->findnodes("//outline");
            foreach (@items) {
                    my(@attrs) = $_->attributes();
                    foreach (@attrs) {
                            # Only print attribute xmlUrl=""
                            if ($_->nodeName =~ /xmlUrl/i) {
                                    print $_->value."\n";
                            }
                    }
            }
    }

If the full code is needed, I can post it.


He who fights dragons for too long becomes a dragon himself; gaze into the abyss for too long, and the abyss will gaze back...

1 Answer

Waiting for an expert to reply.
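
One observation that may help narrow this down: as far as the posted fragment shows, it only reads `outline` elements from an OPML file and prints their `xmlUrl` attributes, so it does not appear to touch the item text at all. If the goal is simply to get plain text out of an HTML item description, a rough sketch in Perl could look like the one below. The helper name `html_to_text` is hypothetical and not part of snownews, and how (or whether) it can be hooked into your setup is an assumption you would need to check. It uses regular expressions only, which is fragile on unusual markup; a real HTML parser would be more robust.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of snownews): reduce an HTML fragment,
# such as an RSS item description, to plain text without images.
sub html_to_text {
    my ($html) = @_;
    return '' unless defined $html;

    # Drop <img ...> tags entirely; a terminal cannot show them anyway.
    $html =~ s/<img\b[^>]*>//gi;

    # Treat <br> and </p> as line breaks so paragraphs stay separated.
    $html =~ s/<br\s*\/?>|<\/p>/\n/gi;

    # Strip every remaining tag (rough: breaks if an attribute contains '>').
    $html =~ s/<[^>]+>//g;

    # Decode a few common entities; &amp; goes last to avoid double decoding.
    $html =~ s/&nbsp;/ /g;
    $html =~ s/&lt;/</g;
    $html =~ s/&gt;/>/g;
    $html =~ s/&quot;/"/g;
    $html =~ s/&#39;/'/g;
    $html =~ s/&amp;/&/g;

    # Tidy whitespace: collapse runs of spaces, trim around line breaks,
    # limit blank lines, and trim both ends of the string.
    $html =~ s/[ \t]+/ /g;
    $html =~ s/[ \t]*\n[ \t]*/\n/g;
    $html =~ s/\n{3,}/\n\n/g;
    $html =~ s/^\s+|\s+$//g;

    return $html;
}

# Usage example with a made-up description containing an image.
my $sample = '<p>Hello <b>world</b></p><img src="a.png" alt="x"/><p>More &amp; more text</p>';
print html_to_text($sample), "\n";
```

The image tag is removed before the generic tag stripping only to make the intent explicit; the text that follows the image is kept, which is the part that currently shows up blank for you.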
