How to: Combine LINQ Queries with Regular Expressions

This example shows how to use the Regex class to create a regular expression for more complex matching in text strings. The LINQ query makes it easy to filter on exactly the files that you want to search with the regular expression, and to shape the results.

Example

Class LinqRegExVB

    Shared Sub Main()

        ' Root folder to query, along with all subfolders.
        ' Modify this path as necessary so that it accesses your Visual Studio folder.
        Dim startFolder As String = "C:\program files\Microsoft Visual Studio 9.0\"
        ' One of the following paths may be more appropriate on your computer.
        'Dim startFolder As String = "C:\program files (x86)\Microsoft Visual Studio 9.0\"
        'Dim startFolder As String = "C:\program files\Microsoft Visual Studio 10.0\"
        'Dim startFolder As String = "C:\program files (x86)\Microsoft Visual Studio 10.0\"

        ' Take a snapshot of the file system.
        Dim fileList As IEnumerable(Of System.IO.FileInfo) = GetFiles(startFolder)

        ' Create a regular expression to find all things "Visual".
        Dim searchTerm As System.Text.RegularExpressions.Regex = 
            New System.Text.RegularExpressions.Regex("Visual (Basic|C#|C\+\+|J#|SourceSafe|Studio)")

        ' Search the contents of each .htm file.
        ' Remove the where clause to find even more matches!
        ' This query produces a list of files where a match
        ' was found, and a list of the matches in that file.
        ' Note: Explicit typing of "Match" in select clause.
        ' This is required because MatchCollection is not a 
        ' generic IEnumerable collection.
        Dim queryMatchingFiles = From afile In fileList
                                Where afile.Extension = ".htm"
                                Let fileText = System.IO.File.ReadAllText(afile.FullName)
                                Let matches = searchTerm.Matches(fileText)
                                Where (matches.Count > 0)
                                Select Name = afile.FullName,
                                       Matches = From match As System.Text.RegularExpressions.Match In matches
                                                 Select match.Value

        ' Execute the query.
        Console.WriteLine("The term " & searchTerm.ToString() & " was found in:")

        For Each fileMatches In queryMatchingFiles
            ' Trim the path a bit, then write 
            ' the file name in which a match was found.
            Dim s = fileMatches.Name.Substring(startFolder.Length - 1)
            Console.WriteLine(s)

            ' For this file, write out all the matching strings
            For Each match In fileMatches.Matches
                Console.WriteLine("  " & match)
            Next
        Next

        ' Keep the console window open in debug mode
        Console.WriteLine("Press any key to exit")
        Console.ReadKey()
    End Sub

    ' Function to retrieve a list of files. Note that this is a copy
    ' of the file information.
    Shared Function GetFiles(ByVal root As String) As IEnumerable(Of System.IO.FileInfo)
        Return From file In My.Computer.FileSystem.GetFiles(
                   root, FileIO.SearchOption.SearchAllSubDirectories, "*.*") 
               Select New System.IO.FileInfo(file)
    End Function

End Class

using System;
using System.Collections.Generic;
using System.Linq;

class QueryWithRegEx
{
    public static void Main()
    {
        // Modify this path as necessary so that it accesses your version of Visual Studio.
        string startFolder = @"c:\program files\Microsoft Visual Studio 9.0\";
        // One of the following paths may be more appropriate on your computer.
        //string startFolder = @"c:\program files (x86)\Microsoft Visual Studio 9.0\";
        //string startFolder = @"c:\program files\Microsoft Visual Studio 10.0\";
        //string startFolder = @"c:\program files (x86)\Microsoft Visual Studio 10.0\";

        // Take a snapshot of the file system.
        IEnumerable<System.IO.FileInfo> fileList = GetFiles(startFolder);

        // Create the regular expression to find all things "Visual".
        System.Text.RegularExpressions.Regex searchTerm =
            new System.Text.RegularExpressions.Regex(@"Visual (Basic|C#|C\+\+|J#|SourceSafe|Studio)");

        // Search the contents of each .htm file.
        // Remove the where clause to find even more matches!
        // This query produces a list of files where a match
        // was found, and a list of the matched values in that file.
        // Note: Explicit typing of "Match" in select clause.
        // This is required because MatchCollection is not a 
        // generic IEnumerable collection.
        var queryMatchingFiles =
            from file in fileList
            where file.Extension == ".htm"
            let fileText = System.IO.File.ReadAllText(file.FullName)
            let matches = searchTerm.Matches(fileText)
            where matches.Count > 0
            select new
            {
                name = file.FullName,
                matchedValues = from System.Text.RegularExpressions.Match match in matches
                                select match.Value
            };

        // Execute the query.
        Console.WriteLine("The term \"{0}\" was found in:", searchTerm.ToString());

        foreach (var v in queryMatchingFiles)
        {
            // Trim the path a bit, then write 
            // the file name in which a match was found.
            string s = v.name.Substring(startFolder.Length - 1);
            Console.WriteLine(s);

            // For this file, write out all the matching strings
            foreach (var v2 in v.matchedValues)
            {
                Console.WriteLine("  " + v2);
            }
        }

        // Keep the console window open in debug mode
        Console.WriteLine("Press any key to exit");
        Console.ReadKey();
    }

    // This method assumes that the application has discovery 
    // permissions for all folders under the specified path.
    static IEnumerable<System.IO.FileInfo> GetFiles(string path)
    {
        if (!System.IO.Directory.Exists(path))
            throw new System.IO.DirectoryNotFoundException();

        string[] fileNames = null;
        List<System.IO.FileInfo> files = new List<System.IO.FileInfo>();

        fileNames = System.IO.Directory.GetFiles(path, "*.*", System.IO.SearchOption.AllDirectories);
        foreach (string name in fileNames)
        {
            files.Add(new System.IO.FileInfo(name));
        }
        return files;
    }
}

Note: You can also query the MatchCollection object that is returned by a Regex search. In this example, only the value of each match is produced in the results. However, it is also possible to use LINQ to perform all kinds of filtering, sorting, and grouping on that collection. Because MatchCollection is a non-generic IEnumerable collection, you have to explicitly state the type of the range variable in the query.
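
As an illustration of the note above, the following is a minimal sketch of grouping and sorting the matches in a MatchCollection. It reuses the regular expression from the example. The class name MatchCollectionQuery, the helper CountMatches, and the sample text are hypothetical and are not part of the original sample.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class MatchCollectionQuery
{
    // Hypothetical helper: counts how often each distinct match occurs,
    // most frequent first, with ties ordered by the matched text.
    public static List<KeyValuePair<string, int>> CountMatches(string text)
    {
        Regex searchTerm = new Regex(@"Visual (Basic|C#|C\+\+|J#|SourceSafe|Studio)");
        MatchCollection matches = searchTerm.Matches(text);

        // The range variable is typed explicitly as Match because
        // MatchCollection is a non-generic IEnumerable collection.
        var query =
            from Match match in matches
            group match by match.Value into g
            orderby g.Count() descending, g.Key
            select new KeyValuePair<string, int>(g.Key, g.Count());

        return query.ToList();
    }

    public static void Main()
    {
        // Sample text chosen for illustration only.
        string sample = "Visual Basic and Visual C# ship with Visual Studio. " +
                        "Visual Basic projects open in Visual Studio.";

        foreach (var pair in CountMatches(sample))
        {
            Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
        }
    }
}
```

In the file-search example, a query of this shape could take the place of the inner matchedValues query if a per-file count of each term is more useful than the raw list of matches.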

Compiling the Code

  • Create a Visual Studio project that targets the .NET Framework version 3.5. By default, the project has a reference to System.Core.dll and a using directive (C#) or Imports statement (Visual Basic) for the System.Linq namespace. In C# projects, add a using directive for the System.IO namespace.

  • Copy this code into your project.

  • Press F5 to compile and run the program.

  • Press any key to exit the console window.

See Also

Tasks

How to: Generate XML from CSV Files

Concepts

LINQ and Strings

LINQ and File Directories