C# help with code im working on (read from external file weblink hybrid)

post #1 of 4
Thread Starter 
Good afternoon guys,

This is my first time posting in this forum. I'm normally in Security, but I need some help with this code.

First let me explain what I'm trying to do:

I'm basically trying to go to a website, gather basic data from a webpage, and then output it as required. I'm currently using list boxes but would like a list box plus CSV.

Here is what I currently have (it's truncated for easier reading).
Note: I'm using Visual Studio Community.


1) Added the following "using" directives:
Code:
using System.Net;
using System.IO;
using System.Text.RegularExpressions;

In my script I have something like this:
Code:
WebClient web = new WebClient();
String html = web.DownloadString("http://www.ipvoid.com/scan/8.8.8.8/");
MatchCollection m1 = Regex.Matches(html, @"IP Address.*<strong>(.+?)<\/strong>", RegexOptions.Singleline);

// Note: this part has me concerned about resources (I think it might be doing a loop, but it is what I found on the internet; see the link below).

foreach (Match m in m1)
{
    string IP = m.Groups[1].Value;
    IPs.Add(IP);
}

listBox1.DataSource = IPs;
}


What I would like to do next, and am having trouble getting to work (lack of practice, and I find it hard to troubleshoot), is the following:

1) My current setup works for only one webpage.

I would like to have a txt file and feed its values in a loop, e.g.:
Code:
String html = web.DownloadString("http://www.ipvoid.com/scan/" + variableFromTxt + "/");
So if my txt has 1, 2, 3, 4, etc., the URLs will look like:
http://www.ipvoid.com/scan/1/
then it gets the IP
http://www.ipvoid.com/scan/2/
then it gets the IP
and so on until there are no more values in the txt.

2) I'm not sure if there's a more efficient way to do this, since values are not repeated like in the video I used as reference. E.g. the whole "foreach" statement might be irrelevant for this code.

The other thing I would like it to do is export to a CSV, but that is not as critical as the other item.

I'm using this youtube video as reference:
https://www.youtube.com/watch?v=rru3G7PLVjw


*Note: I'm currently gathering several inputs, not just one. This is not school work, homework, or a project; it's just me learning and trying to use it on real-life examples.
post #2 of 4
Hope this helps. My Regex is terrible =(
Code:
StreamReader reader = new StreamReader(File.OpenRead(txtFileLocation));
string line;
while ((line = reader.ReadLine()) != null)
{
    WebClient web = new WebClient();
    string html = web.DownloadString("http://www.ipvoid.com/scan/" + line + "/");
    // Slight change to the pattern so that multiple instances will get caught.
    MatchCollection m1 = Regex.Matches(html, @"<tr><td>IP Address<\/td><td><strong>(.+?)<\/strong>");

    // This is a loop. m1 comes back as a collection with one Match for each place the pattern matches.
    // For each Match, the whole text <tr><td>IP Address</td><td><strong>8.8.8.8</strong> is the first group (0), and 8.8.8.8 is the 2nd group (1).
    // If the page also has <tr><td>IP Address</td><td><strong>8.8.8.9</strong>, that is handled on the 2nd iteration of the foreach.
    foreach (Match m in m1)
    {
        string IP = m.Groups[1].Value;
        IPs.Add(IP);
    }
}
listBox1.DataSource = IPs;
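
To make the group numbering in the comments above concrete, here is a minimal sketch that runs the same pattern against a hard-coded HTML fragment instead of a live page. The sample markup and the names GroupDemo and ExtractIps are made up for illustration, and the real ipvoid page may well be laid out differently.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Same pattern as in the post above (the backslash before '/' is optional in .NET).
    private static readonly Regex IpRow =
        new Regex(@"<tr><td>IP Address<\/td><td><strong>(.+?)<\/strong>");

    // Returns the text captured by group 1 for every match in the input.
    public static List<string> ExtractIps(string html)
    {
        var ips = new List<string>();
        foreach (Match m in IpRow.Matches(html))
        {
            // m.Groups[0].Value is the whole matched text; Groups[1] is the (.+?) part.
            ips.Add(m.Groups[1].Value);
        }
        return ips;
    }

    public static void Main()
    {
        // Hypothetical markup, not copied from the real site.
        var html = "<tr><td>IP Address</td><td><strong>8.8.8.8</strong></td></tr>"
                 + "<tr><td>IP Address</td><td><strong>8.8.8.9</strong></td></tr>";
        foreach (var ip in ExtractIps(html))
            Console.WriteLine(ip); // prints 8.8.8.8, then 8.8.8.9
    }
}
```

If the page only ever contains one such row, the foreach simply runs once, so it costs next to nothing to leave it in.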

post #3 of 4
Thread Starter 
@mrzev

I'm going to give it a try. Sorry for the delayed response; I had changed the password on my mail and forgot to update it on the phone.

I ended up installing Python, to be honest, since many of the help guides I found were not working in C#.
post #4 of 4
I haven't really looked much at the Regex of the example above, but I do see a couple of things that shouldn't be left up to the garbage collector. Both StreamReader and WebClient implement IDisposable, which means they should be disposed of when you are finished with them. The easiest way to do that with any object that requires disposal is the following.
Code:
using(var reader = new StreamReader("path"))
{
    //Code here
}


Even if you return something out of the using statement the object will still be disposed of properly.
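
A small sketch of that last point, in case it helps to see it run. TrackedResource is a hypothetical stand-in for StreamReader (it is not a real framework type); it only records whether Dispose was called, so the early return can be checked.

```csharp
using System;

// Hypothetical resource that records whether Dispose was called.
public sealed class TrackedResource : IDisposable
{
    public bool Disposed { get; private set; }
    public string Read() { return "data"; }
    public void Dispose() { Disposed = true; }
}

public static class UsingDemo
{
    // Returns from inside the using block; Dispose still runs before the caller resumes.
    public static string ReadAndReturn(TrackedResource resource)
    {
        using (resource)
        {
            return resource.Read();
        }
    }

    public static void Main()
    {
        var resource = new TrackedResource();
        var value = ReadAndReturn(resource);
        Console.WriteLine(value + " / disposed: " + resource.Disposed); // prints "data / disposed: True"
    }
}
```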

The second thing I notice is that you are creating the WebClient inside of the loop. That means a new instance is created on every iteration, which is not necessary. Since both objects can be disposed of at the same time, the code then turns into:
Code:
using(var reader = new StreamReader("path"))
using(var webClient = new WebClient())
{
    //Code here
}

Notice that you do not have to nest the using statements, just stack them and use one set of braces.
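
Stacked using statements behave the same as nested ones, so the objects are disposed in reverse order of declaration. Here is a rough sketch showing that; the Noisy class and its names are invented for the demo and just log when Dispose runs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical disposable that writes its name to a shared log when disposed.
public sealed class Noisy : IDisposable
{
    private readonly string _name;
    private readonly List<string> _log;
    public Noisy(string name, List<string> log) { _name = name; _log = log; }
    public void Dispose() { _log.Add(_name); }
}

public static class StackedUsingDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        using (var reader = new Noisy("reader", log))
        using (var client = new Noisy("client", log))
        {
            log.Add("body"); // work that would use both objects goes here
        }
        return log;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // prints "body, client, reader"
    }
}
```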

The other simplification that I made was using "var" instead of the actual type. Using var is fine for local variables inside of a method. If you are declaring a field at class level, you must use the type.


If you are interested in a slightly different method using some lambda expressions and File.ReadAllLines(), check this out. ReadAllLines relieves you of having to open and close a reader object. It returns a string array (which can be enumerated like any IEnumerable) with one element per line of the file.

I had to declare a couple of IEnumerable extensions. It is hard to stop using "Each" once you get used to having it. We use it a ton here. In the example below I left in the standard foreach loop. That could have been replaced with items.Each(l => {}); For debugging purposes it is sometimes easier to keep the regular loop until your code is solid. The text file is just one IP address per line.
Code:
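// Namespaces used: System, System.Collections.Generic, System.IO, System.Linq,
// System.Net, System.Text.RegularExpressions
void Main()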
{
        var IPs = new List<string>();
        var path = @"<path>source_ips.txt";
        var items = File.ReadAllLines(path);
        var re = new Regex(@"<tr><td>IP Address<\/td><td><strong>(.+?)<\/strong>");
        using(var webClient = new WebClient())
        {
                foreach (var l in items)
                {
                        var html = webClient.DownloadString(string.Format(@"http://www.ipvoid.com/scan/{0}/",l));
                        var m1 = re.Matches(html).Cast<Match>();
                        m1.Each(m => IPs.Add(m.Groups[1].Value));
                }
        }
}

public static class SystemCollectionsGenericIEnumerable
{
        public static void Each<T>(this IEnumerable<T> ie, Action<T> action)
        {
                if (action == null) throw new ArgumentNullException("action");
                Each(ie, (x, i) => action(x));
        }

        public static void Each<T>(this IEnumerable<T> ie, Action<T, int> action)
        {
                if (ie == null) return;
                if (action == null) throw new ArgumentNullException("action");

                var i = 0;
                foreach (var e in ie) action(e, i++);
        }
}
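
The index-aware overload above is never exercised in Main, so here is a rough sketch that uses it for the CSV export asked about in the first post. This is only one way to do it: the class names, the "Row,IP" header, and the results.csv file name are all assumptions, and the Each helper is repeated under a different class name so the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableExtensions
{
    // Same index-aware helper as in the post above, repeated so this sketch stands alone.
    public static void Each<T>(this IEnumerable<T> ie, Action<T, int> action)
    {
        if (ie == null) return;
        if (action == null) throw new ArgumentNullException("action");
        var i = 0;
        foreach (var e in ie) action(e, i++);
    }
}

public static class CsvSketch
{
    // Quote a field only if it contains a comma, a double quote, or a line break.
    public static string Escape(string field)
    {
        if (field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0) return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Builds a header line plus one "row number,IP" line per entry.
    public static List<string> ToCsvLines(IEnumerable<string> ips)
    {
        var lines = new List<string> { "Row,IP" };
        ips.Each((ip, i) => lines.Add((i + 1) + "," + Escape(ip)));
        return lines;
    }

    public static void Main()
    {
        var lines = ToCsvLines(new[] { "8.8.8.8", "8.8.4.4" });
        lines.ForEach(Console.WriteLine);
        // To save to disk (file name is an assumption):
        // System.IO.File.WriteAllLines("results.csv", lines);
    }
}
```

In the thread's code you would pass the IPs list to ToCsvLines after the download loop finishes.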

Edited by BFRD - 6/24/16 at 6:16am