Retrieve a list of words from a website and show a word count plus a specified number of most frequently occurring words
I have to:
1. Retrieve the document text from the web (provided by the utility class).
2. Filter the desired "words" from the document and, one by one, store each word as a key in a Map<String,Integer> object, where the value is the number of occurrences of the word.
3. Read the (word, num_occurrences) map entry pairs into an array/list structure of your choice.
4. Sort the pair list by num_occurrences.
5. Print the total number of words processed, the number of unique words, and the N pairs which have the largest number of occurrences.
Here's what I have so far. The first class is the WebDoc utility class and the second is the main class. I have added commented-out blocks where the new code should go. Please help!
package util;

import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.HTML;
import java.io.InputStreamReader;
import java.io.IOException;
import java.net.URL;
import java.net.MalformedURLException;

public class WebDoc {

    public static String getBodyContent(String urlstr)
            throws MalformedURLException, IOException {
        /*
         * The following convoluted code is necessary because getParser()
         * is a protected method in HTMLEditorKit.
         * We create an anonymous extension of HTMLEditorKit with a public
         * getParser method calling the protected method of the superclass.
         */
        HTMLEditorKit.Parser parser = new HTMLEditorKit() {
            @Override
            public HTMLEditorKit.Parser getParser() {
                return super.getParser();
            }
        }.getParser();

        class DocStatus {
            public String content = "";
            public boolean body_started = false;
        }
        final DocStatus status = new DocStatus();

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            // handle the tags: look for the BODY tag
            @Override
            public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
                if (t == HTML.Tag.BODY) {
                    status.body_started = true;
                }
            }

            // handle the text between tags: concatenate all text after BODY tag
            @Override
            public void handleText(char[] text, int position) {
                if (status.body_started) {
                    status.content += String.valueOf(text) + " ";
                }
            }
        };

        URL url = new URL(urlstr);
        InputStreamReader r = new InputStreamReader(url.openStream());
        parser.parse(r, callback, true);
        return status.content;
    }
}
package dsprog3;

import java.util.*;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import util.WebDoc;

public class DSProg3 {

    public static void main(String[] args) {
        String url;
        //test URLs
        url = "http://en.wikipedia.org/wiki/Jimi_Hendrix";

        final int N = 25; //the number of word/frequency pairs to print

        //word pattern recognizes a string of 5 or more letters
        String word_pattern = "[A-Za-z]{5,}";

        String content = null;
        try {
            content = WebDoc.getBodyContent(url); // get body of the web document
        } catch (Exception ex) {
            ex.printStackTrace();
            System.exit(1);
        }

        Map<String,Integer> wordCount = new HashMap<String,Integer>();
        int total_words = 0;

        Matcher match = Pattern.compile(word_pattern).matcher(content);
        while (match.find()) {
            ++total_words;
            //get the next word which matches the word_pattern
            //and normalize it by making it lower case
            String word = match.group().toLowerCase();
            //System.out.println(word); //use this for testing

            /**ADD CODE
             *
             * "register" one more occurrence of key, word, in the wordCount map
             */
        }
        //System.out.println(wordCount); //use this for testing

        //use this class as is or modify it
        class WordPair {
            String word;
            Integer count; // number of occurrences

            WordPair(String word, Integer count) {
                this.word = word;
                this.count = count;
            }
        }

        /**ADD CODE
         *
         * Create an array/list structure to hold WordPair objects
         * Iterate through wordCount and store the Map entry pairs
         * into the array/list structure
         */

        /**ADD CODE
         *
         * Create a comparator for WordPair objects which compares by
         * the count component
         *
         * Then sort the array/list using this comparator
         */

        /**ADD CODE
         *
         * Print
         * total_words
         * # of unique words
         * the N entries in the array/list corresponding to the
         * pairs with the highest count values
         */
    }
}
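One possible way to fill in the ADD CODE blocks is sketched below. It is not the only approach, and it rests on a few assumptions: Java 8 or later (for Map.merge and lambdas), a hard-coded sample string in place of WebDoc.getBodyContent(url) so that it runs without network access, and alphabetical order as the tie-break between equal counts. The names WordFrequencySketch, countWords and sortedPairs are made up for this illustration and do not appear in the original code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name; the logic mirrors the ADD CODE blocks in DSProg3.
class WordFrequencySketch {

    // Same shape as the WordPair class in the question.
    static class WordPair {
        final String word;
        final int count; // number of occurrences

        WordPair(String word, int count) {
            this.word = word;
            this.count = count;
        }
    }

    // Step 2: register one occurrence per lower-cased match of wordPattern.
    static Map<String, Integer> countWords(String content, String wordPattern) {
        Map<String, Integer> wordCount = new HashMap<>();
        Matcher match = Pattern.compile(wordPattern).matcher(content);
        while (match.find()) {
            String word = match.group().toLowerCase();
            // Inserts 1 for a new key, otherwise adds 1 to the stored value.
            wordCount.merge(word, 1, Integer::sum);
        }
        return wordCount;
    }

    // Steps 3 and 4: copy the map entries into a list and sort it by count,
    // highest first. Ties are broken alphabetically (an assumption).
    static List<WordPair> sortedPairs(Map<String, Integer> wordCount) {
        List<WordPair> pairs = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : wordCount.entrySet()) {
            pairs.add(new WordPair(entry.getKey(), entry.getValue()));
        }
        Comparator<WordPair> byCountDescending = (a, b) -> {
            int byCount = Integer.compare(b.count, a.count);
            return byCount != 0 ? byCount : a.word.compareTo(b.word);
        };
        pairs.sort(byCountDescending);
        return pairs;
    }

    public static void main(String[] args) {
        // Sample text standing in for WebDoc.getBodyContent(url).
        String content = "Guitar guitar GUITAR blues Blues album the a";
        final int N = 2; // number of word/frequency pairs to print

        Map<String, Integer> wordCount = countWords(content, "[A-Za-z]{5,}");

        // The total equals the sum of all counts in the map.
        int totalWords = 0;
        for (int count : wordCount.values()) {
            totalWords += count;
        }

        List<WordPair> pairs = sortedPairs(wordCount);

        // Step 5: print the totals and the N most frequent pairs.
        System.out.println("Total words: " + totalWords);
        System.out.println("Unique words: " + wordCount.size());
        for (int i = 0; i < Math.min(N, pairs.size()); i++) {
            WordPair p = pairs.get(i);
            System.out.println(p.word + ": " + p.count);
        }
    }
}
```

Inside the existing while loop in DSProg3, the counting step alone would be the single line wordCount.merge(word, 1, Integer::sum); on older Java versions the same effect can be had with containsKey, get and put. Math.min(N, pairs.size()) guards against pages that contain fewer than N distinct words.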