Searching a word list with a HashMap in Java

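I have a word list with more than 50,000 words. As you can see below, I read the words and add them to an ArrayList, but after that, looking up a word is very slow. That is why I thought of a HashMap. I want to read my words and then, when I receive a word from the user, check whether it is in the HashMap. Even after some research I could not find the exact way to do this. How can I do it?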

  public ArrayList<String> wordReader () throws FileNotFoundException {
        File txt = new File(path);
        Scanner scanner = new Scanner(txt);
        ArrayList <String> words = new ArrayList<String>();
        while (scanner.hasNextLine()) {
            String data = scanner.nextLine();
            words.add(data);
        }
        scanner.close();
        return words;
    }

Answer from a uj5u.com user:

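If I understand your question correctly, you are running into performance problems when you iterate over an ArrayList holding 50,000 words to check whether a particular word is in the list.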

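This is because finding an element in an unsorted List has O(n) complexity. You can improve performance by using a sorted data structure such as a BST (binary search tree), which brings the lookup down to O(log n) as long as the tree stays balanced.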
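As a rough illustration of the sorted-structure idea, here is a minimal sketch using java.util.TreeSet, which the JDK backs with a red-black tree (a self-balancing BST). The class and method names are made up for this example and are not from the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper class, only for illustration.
class SortedLookupSketch {

    // Copies the words into a TreeSet; contains() on the result is O(log n).
    static TreeSet<String> toSortedSet(List<String> words) {
        return new TreeSet<>(words);
    }

    public static void main(String[] args) {
        TreeSet<String> sorted = toSortedSet(Arrays.asList("pear", "apple", "fig"));
        System.out.println(sorted.contains("apple")); // lookup walks the tree
        System.out.println(sorted.contains("kiwi"));
        System.out.println(sorted.first());           // smallest word in natural order
    }
}
```

A TreeSet also keeps the words in alphabetical order, which can be handy if you later need range queries or prefix scans.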

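Your idea of using a Map is also perfectly viable, because a HashMap gives put and get operations a complexity between O(1) (with a theoretically perfect hashing algorithm and no collisions between keys) and O(n) (with a bad hashing algorithm and a high probability of collisions). Moreover, since Java 8 the HashMap implementation includes an optimization: in high-collision situations where many elements land in the same bucket, that bucket is stored as a balanced tree rather than a list, which gives O(log n) complexity in the worst case.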
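If you do want a HashMap, the check is done with containsKey. The sketch below is only one possible shape: storing the index of each word's first occurrence as the value is an assumption made here to give the values a purpose, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, only for illustration.
class HashMapLookupSketch {

    // Maps each word to the index of its first occurrence in the list (an assumed use for the value).
    static Map<String, Integer> indexWords(List<String> words) {
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < words.size(); i++) {
            index.putIfAbsent(words.get(i), i); // keeps the first index when a word repeats
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, Integer> index = indexWords(Arrays.asList("apple", "fig", "apple"));
        System.out.println(index.containsKey("fig"));  // membership check, O(1) on average
        System.out.println(index.containsKey("kiwi"));
        System.out.println(index.get("apple"));        // index of the first "apple"
    }
}
```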

https://www.logicbig.com/tutorials/core-java-tutorial/java-collections/java-map-cheatsheet.html

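However, for what I assume is a dictionary (distinct words only), a HashMap may be unnecessary, because you would be using each word as both key and value. Instead of a HashMap you can use a Set, as others have pointed out, or better, a HashSet. In fact, a HashSet is implemented on top of an underlying HashMap instance, so it gives you all the performance characteristics discussed above (which is why I wrote that preamble).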

https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/HashSet.html

Your implementation could look like this:

public Set<String> wordReader(String path) throws FileNotFoundException {
    File txt = new File(path);
    Scanner scanner = new Scanner(txt);
    Set<String> words = new HashSet<>();
    while (scanner.hasNextLine()) {
        String data = scanner.nextLine();
        words.add(data);
    }
    scanner.close();
    return words;
}

public boolean isWordContained(Set<String> set, String word) {
    return set.contains(word);
}

Answer from a uj5u.com user:

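Since you will be checking whether an entered word exists in the list of words read from the file, you can use a HashSet<String> instead of an ArrayList<String>.

Your method then becomes: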

public HashSet<String> wordReader () throws FileNotFoundException {
        File txt = new File(path);
        Scanner scanner = new Scanner(txt);
        HashSet <String> words = new HashSet<String>();
        while (scanner.hasNextLine()) {
            String data = scanner.nextLine();
            words.add(data);
        }
        scanner.close();
        return words;
    }

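Now, after you read the input word, you can check whether it is present in the HashSet. This is a much faster operation, because a lookup takes constant time on average.

public boolean isWordPresent(String word, HashSet<String> words){
    return words.contains(word);
}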

As a side note, HashSet uses a HashMap internally to perform its operations.

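Answer from a uj5u.com user:

I would use a Set rather than a List, because a set automatically ignores duplicates when you add to it. The add method returns true and adds the element if it was not already present; otherwise it returns false.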

public Set<String> wordReader () throws FileNotFoundException {
        File txt = new File(path);
        Scanner scanner = new Scanner(txt);
        Set <String> words = new HashSet<>();
        while (scanner.hasNextLine()) {
            String data = scanner.nextLine();
            if(!words.add(data)) {
               // present - Do something
            } 
         }   
        
        scanner.close();
        return words;
}

Because sets are not ordered, they are not random-access collections. You can therefore copy the set into a list, as follows:

Set<String> words = wordReader();
List<String> wordList = new ArrayList<>(words);

Now you can retrieve them by index.
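One thing to keep in mind: a HashSet does not promise any iteration order, so the indexes in that list will not necessarily match the order of the file. If that matters to you (an assumption on my part), a LinkedHashSet keeps insertion order while still dropping duplicates. A minimal sketch with hypothetical names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class, only for illustration.
class OrderedWordsSketch {

    // Removes duplicates but keeps each word at the position of its first occurrence.
    static List<String> distinctInOrder(List<String> lines) {
        Set<String> unique = new LinkedHashSet<>(lines); // remembers insertion order
        return new ArrayList<>(unique);                  // the list gives index access
    }

    public static void main(String[] args) {
        List<String> words = distinctInOrder(Arrays.asList("pear", "apple", "pear", "fig"));
        System.out.println(words.get(0));
        System.out.println(words);
    }
}
```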

You may want to make your method more general by passing the file name as an argument.
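For example, a version that takes the file as a parameter might look roughly like this. It is a sketch, not a drop-in replacement: it swaps Scanner for Files.lines with try-with-resources so the file is closed even if reading fails, and the trimming and blank-line skipping are assumptions of mine that the original method does not make. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class, only for illustration.
class WordFileSketch {

    // Reads one word per line (UTF-8) from the given file into a HashSet.
    static Set<String> readWords(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {   // stream is closed automatically
            return lines.map(String::trim)                 // assumption: ignore surrounding spaces
                        .filter(line -> !line.isEmpty())   // assumption: skip blank lines
                        .collect(Collectors.toCollection(HashSet::new));
        }
    }

    public static void main(String[] args) throws IOException {
        // Demo with a temporary file so the example does not depend on a real word list.
        Path tmp = Files.createTempFile("words", ".txt");
        Files.write(tmp, Arrays.asList("apple", "fig", "apple"));
        Set<String> words = readWords(tmp);
        System.out.println(words.contains("fig"));
        System.out.println(words.size());
        Files.delete(tmp);
    }
}
```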
