C# help with code im working on (read from external file weblink hybrid) - Overclock.net - An Overclocking Community


C# help with code im working on (read from external file weblink hybrid)

 
post #1 of 4 (permalink) Old 05-07-2016, 10:50 PM - Thread Starter
MasterKH
Good afternoon guys,

This is my first post in this forum; I'm normally in Security, but I need some help with this code.

First, let me explain what I'm trying to do:

I'm trying to download a webpage, gather some basic data from it, and then output it as required. I'm currently using list boxes, but I would like a list box plus CSV output.

Here is what I have so far (it's truncated for easier reading).
Note: I'm using Visual Studio Community.


1) I added the following using directives:
Code:
using System.Net;
using System.IO;
using System.Text.RegularExpressions;

In my code I have something like this:
Code:
    WebClient web = new WebClient();
    String html = web.DownloadString("http://www.ipvoid.com/scan/8.8.8.8/");
    MatchCollection m1 = Regex.Matches(html, @"IP Address.*<strong>(.+?)<\/strong>", RegexOptions.Singleline);

Note: this part has me concerned about resources (I think it might be doing an unnecessary loop, but it is what I found on the internet; see the link below).

    // IPs is declared in the truncated part of the code (presumably a List<string>).
    foreach (Match m in m1)
    {
        string IP = m.Groups[1].Value;
        IPs.Add(IP);
    }

    listBox1.DataSource = IPs;
}


Here is what I would like to do but am having trouble getting to work, due to lack of practice and the complexity of troubleshooting it:

1) My current setup works for only one webpage.

I would like to have a txt file and feed its values in through a loop, e.g.:
Code:
String html = web.DownloadString("http://www.ipvoid.com/scan/" + variablefromtxt + "/");
So if my txt has 1, 2, 3, 4, etc., it will look like
http://www.ipvoid.com/scan/1/
then gets the IP
http://www.ipvoid.com/scan/2/
then gets the IP
etc., until there are no more values in the txt.

2) I'm not sure if there's a more efficient way to do this, since the values are not repeated like in the video I used as reference. E.g. the whole "foreach" statement might be unnecessary for this code.

The other thing I would like it to do is export to a CSV, but that is not as critical as the first item :)

I'm using this youtube video as reference:
https://www.youtube.com/watch?v=rru3G7PLVjw


*Note: I'm currently gathering several inputs, not just one. This is not school work, homework, or a project; it's just me learning and trying to use it on real-life examples.
post #2 of 4 (permalink) Old 05-10-2016, 06:54 AM
Mrzev
Hope this helps. My Regex is terrible =(
Code:
StreamReader reader = new StreamReader(File.OpenRead(txtFileLocation));
string line;
while ((line = reader.ReadLine()) != null)
{
    WebClient web = new WebClient();
    string html = web.DownloadString("http://www.ipvoid.com/scan/" + line + "/");
    // Slight change to ensure multiple instances will get caught.
    MatchCollection m1 = Regex.Matches(html, @"<tr><td>IP Address<\/td><td><strong>(.+?)<\/strong>");

    // This is a loop. m1 comes back as a list with one entry for each time the pattern matches.
    // <tr><td>IP Address</td><td><strong>8.8.8.8</strong> is the first group (0), and 8.8.8.8 is the 2nd group (1).
    // If there is also <tr><td>IP Address</td><td><strong>8.8.8.9</strong>, that would be the 2nd iteration of the foreach.
    foreach (Match m in m1)
    {
        string IP = m.Groups[1].Value;
        IPs.Add(IP);
    }
}
listBox1.DataSource = IPs;
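
To make the group 0 / group 1 comments above concrete, here is a minimal sketch that runs the same pattern against a string instead of a live download. The class name IpExtractor and the method name Extract are made up for illustration, and the HTML layout the pattern expects is an assumption about the page, not something confirmed here.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper; not part of the code posted above.
public static class IpExtractor
{
    // Same pattern as in the post above. The page layout it matches is assumed.
    private static readonly Regex IpRow =
        new Regex(@"<tr><td>IP Address<\/td><td><strong>(.+?)<\/strong>");

    // Returns the text captured by group 1 for every match found in the page.
    public static List<string> Extract(string html)
    {
        var result = new List<string>();
        foreach (Match m in IpRow.Matches(html))
        {
            // m.Groups[0].Value is the whole matched row; Groups[1] is only the IP.
            result.Add(m.Groups[1].Value);
        }
        return result;
    }
}
```

Because the foreach runs once per match, a page with a single "IP Address" row simply produces a one-item list, so the loop costs nothing extra in that case.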




post #3 of 4 (permalink) Old 05-21-2016, 08:23 PM - Thread Starter
MasterKH
@mrzev

I'm going to give it a try. Sorry for the delayed response; I had changed the password on my mail and forgot to update it on the phone.

I ended up installing Python, to be honest, since many of the help guides I found were not working in C#.
post #4 of 4 (permalink) Old 06-24-2016, 06:14 AM
BFRD
I haven't really looked much at the Regex in the example above, but I do see a couple of things that shouldn't be left up to the garbage collector. Both StreamReader and WebClient implement IDisposable, which means they should be disposed of when you are finished with them. The easiest way to do that with any object that requires disposal is the following.
Code:
using(var reader = new StreamReader("path"))
{
    //Code here
}


Even if you return something out of the using statement the object will still be disposed of properly.
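
A minimal sketch of that last point, using a made-up TrackedResource class (not a real framework type) that only records whether Dispose() was called:

```csharp
using System;

// Hypothetical resource that records whether Dispose() ran.
public sealed class TrackedResource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        IsDisposed = true;
    }
}

public static class UsingDemo
{
    // Returns from inside the using block; Dispose() still runs
    // before control gets back to the caller.
    public static string ReadInsideUsing(TrackedResource resource)
    {
        using (resource)
        {
            return "value";
        }
    }
}
```

After calling UsingDemo.ReadInsideUsing(r), r.IsDisposed is true even though the method returned from the middle of the block. The same applies if an exception is thrown inside the block, because the compiler expands using into a try/finally.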

The second thing I notice is that you are creating the WebClient inside of the loop. That means a new instance is created for every line of the file, which is not necessary; one instance can be reused for all of the downloads. Since both objects can be disposed of at the same time, the code then turns into:
Code:
using(var reader = new StreamReader("path"))
using(var webClient = new WebClient())
{
    //Code here
}

Notice that you do not have to nest the using statements, just stack them and use one set of braces.

The other simplification I made was using "var" instead of the actual type. Using var is fine for local variables inside of a method. If you are declaring a field at class level, you must use the explicit type.
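
A short sketch of where var is and isn't allowed. The class name IpCollector and its members are invented for illustration only.

```csharp
using System.Collections.Generic;

// Hypothetical class, only here to contrast a field with a local variable.
public class IpCollector
{
    // Field: the type must be written out; "var" does not compile here.
    private readonly List<string> _ips = new List<string>();

    public int AddAndCount(string ip)
    {
        // Local variable: the compiler infers string from the right-hand side.
        var trimmed = ip.Trim();
        _ips.Add(trimmed);
        return _ips.Count;
    }
}
```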


If you are interested in a slightly different method using some lambda expressions and File.ReadAllLines(), check this out. ReadAllLines relieves you of having to open and close a reader object. It returns a string array (which is also an IEnumerable<string>) with one entry per line of the file.

I had to declare a couple of IEnumerable extensions. It is hard to stop using "Each" once you get used to having it; we use it a ton here. In the example below I left in the standard foreach loop. That could have been replaced with items.Each(l => {}); but for debugging purposes it is sometimes easier to keep the regular loop until your code is solid. The text file just has an IP address on each line.
Code:
// Needs these namespaces: System, System.Collections.Generic, System.IO,
// System.Linq, System.Net, System.Text.RegularExpressions.
void Main()
{
    var IPs = new List<string>();
    var path = @"<path>source_ips.txt";
    var items = File.ReadAllLines(path);
    var re = new Regex(@"<tr><td>IP Address<\/td><td><strong>(.+?)<\/strong>");
    // One WebClient is reused for every download and disposed at the end.
    using (var webClient = new WebClient())
    {
        foreach (var l in items)
        {
            var html = webClient.DownloadString(string.Format(@"http://www.ipvoid.com/scan/{0}/", l));
            var m1 = re.Matches(html).Cast<Match>();
            m1.Each(m => IPs.Add(m.Groups[1].Value));
        }
    }
}

public static class SystemCollectionsGenericIEnumerable
{
    // Runs the action once for every element.
    public static void Each<T>(this IEnumerable<T> ie, Action<T> action)
    {
        if (action == null) throw new ArgumentNullException("action");
        Each(ie, (x, i) => action(x));
    }

    // Runs the action once for every element, also passing its zero-based index.
    public static void Each<T>(this IEnumerable<T> ie, Action<T, int> action)
    {
        if (ie == null) return;
        if (action == null) throw new ArgumentNullException("action");

        var i = 0;
        foreach (var e in ie) action(e, i++);
    }
}
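
The CSV export asked about in the first post was not covered above, so here is a minimal sketch of one way it might be done. The names CsvExport, Escape and Write are made up for illustration, and the single "IP" header column is an assumption about the desired layout.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical helper; the one-column layout is an assumption.
public static class CsvExport
{
    // Quotes a field when it contains a comma, a double quote, or a line break,
    // doubling any embedded quotes (the usual CSV convention).
    public static string Escape(string field)
    {
        if (field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0) return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Writes a header row followed by one row per IP address.
    public static void Write(TextWriter writer, IEnumerable<string> ips)
    {
        writer.WriteLine("IP");
        foreach (var ip in ips)
        {
            writer.WriteLine(Escape(ip));
        }
    }
}
```

With the IPs list from the example above, it could be called as using (var w = new StreamWriter("ips.csv")) CsvExport.Write(w, IPs); where ips.csv is just a placeholder file name. Taking a TextWriter rather than a path keeps the helper easy to try against a StringWriter before touching the disk.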


